The AI-Ready Data Engineer

Chapter 5: Resources and Milestones

Curated AI resources for data engineers: top courses from DeepLearning.AI and Fast.ai, essential GitHub repos, YouTube channels, communities, documentation, and progress milestones to track your growth.

Courses worth your time

Building and Evaluating Advanced RAG — DeepLearning.AI
LangChain for LLM Application Development
Functions, Tools and Agents with LangChain
Knowledge Graphs for RAG
Preprocessing Unstructured Data for LLM Applications
DeepLearning.AI — Andrew Ng's team, consistently excellent
Fast.ai — Free and better than most paid courses
Complete Agentic AI Bootcamp — Udemy, solid content
Coursera AI courses — IBM and Stanford courses are legit
DataCamp — Good for beginners, nice UI
LangChain Academy — Free and directly from the source
Hugging Face AI Agents Course — Free, certified, covers smolagents, LlamaIndex, and LangGraph
Hugging Face LLM Course — Free, covers Transformers, fine-tuning, and reasoning models

GitHub repos to star immediately

AltimateAI/altimate-code — Open-source agentic data engineering harness
artidoro/qlora — Efficient fine-tuning
axolotl-ai-cloud/axolotl — Fine-tuning made easy
confident-ai/deepeval — LLM testing framework
crewAIInc/crewAI — Multi-agent orchestration
dair-ai/Prompt-Engineering-Guide — Comprehensive prompting
explodinggradients/ragas — RAG evaluation
langchain-ai/langgraph — Stateful agents
langchain-ai/rag-from-scratch — RAG fundamentals
microsoft/AI-For-Beginners — Surprisingly comprehensive
modelcontextprotocol/servers — MCP integrations
NirDiamant/GenAI_Agents — Production agent examples
NirDiamant/RAG_Techniques — Every RAG pattern you need
openai/evals — Evaluation framework
openai/openai-agents-python — OpenAI Agents SDK
promptfoo/promptfoo — Prompt testing and red-teaming
stanfordnlp/dspy — Automated prompt engineering
unslothai/unsloth — Fast, memory-efficient fine-tuning

YouTube channels that don't waste your time

Andrej Karpathy — Ex-OpenAI, pure technical content
Sam Witteveen — LangChain specialist, great energy
Matthew Berman — Open source focus, daily videos
Yannic Kilcher — Best paper explanations
Krish Naik — Practical implementations
DeepLearning.AI — Official channel with free courses
freeCodeCamp — Long-form tutorials
3Blue1Brown — Best conceptual explanation of how neural networks actually work

Communities where real learning happens

DataTalks.Club Slack — 13k+ members, super active
r/LocalLLaMA — Where the open source community lives
r/MachineLearning — Academic discussions
Hugging Face Discord — The most active open-source ML community; directly tied to the tools you're using
Twitter/X — Follow @karpathy, @sama, @ylecun, @emollick

Additional documentation and guides

OpenAI Cookbook — Practical examples
Anthropic Claude docs — Excellent MCP documentation
Altimate Code docs — Agentic data engineering harness
Hugging Face documentation — Models and datasets
Google AI documentation — Gemini and more
AWS AI/ML resources — Comprehensive guides
Microsoft AI docs — Azure AI services
Towards Data Science — Community articles
Ben's Bites — Best daily AI newsletter, period (according to Anand!)
The Rundown AI — Good alternative to Ben's Bites
Chip Huyen's blog — Production ML wisdom
Papers with Code — Implementation paradise
The Gradient — Thoughtful long-form content
Latent Space — AI engineering focused
Sebastian Raschka — Best paper summaries
Alpha Signal — Daily digest of top repos, papers, and models; strong signal-to-noise for staying current

After a weekend

You've built a working RAG app
You understand agents and can build basic ones
You can contribute meaningfully to AI discussions
You know the landscape and key concepts

After a month

You can architect AI systems
You understand multiple frameworks well
You can evaluate and debug AI applications effectively

After three months

You're comfortable leading AI initiatives
You're contributing to open source projects
You understand production deployment deeply
You can design and implement complex AI systems
