Data Roles, Small Language Models, Knowledge Graphs, and More: Our January Must-Reads

Dr. Owns

January 30, 2025

The Variable is moving soon—sign up here to ensure you receive all future newsletters.

Our prolific authors delivered some excellent work this past month, channeling all the renewed energy and excitement we’ve come to expect from January on TDS. From career advice to core programming and data-processing tasks, our most-read and -shared articles in the past month cover the topics that data professionals care about the most as they plan their next move and aim to expand their skill set.

We invite you to explore this month’s must-reads with an open mind: from the ever-shifting terrain of job descriptions to the rise of small language models (alongside large ones), they tackle well-covered areas in data science and machine learning from a fresh, actionable, and pragmatic perspective. Let’s get started.

  • How to Pick Between Data Science, Data Analytics, Data Engineering, ML Engineering, and SW Engineering
    “When the job titles sound so similar and the roles have a good amount of overlap,” it can be difficult to choose the right path for your own interests and priorities as a data practitioner. Marina Wyss – Gratitude Driven’s clear and detailed overview will help you make an informed decision.
  • Your Company Needs Small Language Models
    Is it time to reassess the axiom that in AI, bigger is always better? Sergei Savvov makes a compelling case for the growing footprint of small language models in industry contexts, outlining the ways “they can reduce costs, improve accuracy, and maintain control of your data,” and urges us to stay mindful of these models’ current limitations.
  • The Large Language Model Course
    For anyone whose new year’s resolutions included expanding their knowledge of (and practical experience with) LLMs, Maxime Labonne’s comprehensive course is the one-stop resource you’ll need to get started—it offers a well-structured curriculum that assumes no advanced knowledge, and comes full of recommended articles, tutorials, and tools.
Photo by Rima Kruciene on Unsplash

  • 5 Simple Projects to Start Today: A Learning Roadmap for Data Engineering
    For all the aspiring data engineers out there, Sarah Lea outlines a realistic, four-month plan that covers all the essentials. Most importantly, it guides you through several project ideas so that you not only grow your theoretical knowledge, but also get to practice the concepts and workflows you’re learning about.
  • How to Build a Knowledge Graph in Minutes (And Make It Enterprise-Ready)
    Looking to apply a similar hands-on approach to knowledge graphs? After facing major setbacks in the past, Thuwarakesh Murallie shows how you can build one on your own by leveraging the power of LLMs.
  • Top 12 Skills Data Scientists Need to Succeed in 2025
    “Many things are changing, but other things are not. Understanding which changes require your attention is the key to success.” Benjamin Bodner breaks down the skills that remain essential for data professionals amid the disruptive pressure of AI tools.
  • Deep Dive into Multithreading, Multiprocessing, and Asyncio
    Whether you’re taking your first steps in Python or are already a seasoned programmer, there’s always more to learn; Clara Chong’s recent post zooms in on concurrency models (approaches to handling multiple tasks simultaneously) and unpacks the stakes of choosing the right one, depending on your project’s needs.
  • How to Run Jupyter Notebooks and Generate HTML Reports with Python Scripts
    For another Python-focused tutorial that foregrounds the ability to streamline tedious workflows with code, don’t miss Amanda Iglesias Moreno’s guide to automating Jupyter Notebook execution and report generation (with a helpful detour through synthetic-data creation).
  • My Experience Switching From Power BI to Looker (as a Senior Data Analyst)
    It’s often difficult to assess the tradeoffs and gains you’ll be making by adopting a new tool without first trying different options or hearing firsthand accounts from professionals in a similar situation to yours. Tomas Jancovic (It’s AI Thomas) walks us through the journey of moving to Looker for his data analytics workflows, and presents a balanced, concrete account of its pros and cons.
  • Three Important Pandas Functions You Need to Know
    New models and generative-AI apps come and go, but Pandas is still here with us, playing an important role in data scientists’ day-to-day work. Jiayan Yin recently shared a helpful guide for early-stage learners, zooming in on three essential functions that you’ll likely turn to again and again as you process and analyze datasets.

Our latest cohort of new authors

Every month, we’re thrilled to see a fresh group of authors join TDS, each sharing their own unique voice, knowledge, and experience with our community. If you’re looking for new writers to explore and follow, just browse the work of our latest additions from the past couple of months, including Ramsha Ali, Derick Ruiz, Dr. Marcel Müller, Rodrigo M Carrillo Larco, MD, PhD, Ilona Hetsevich, Federico Zabeo, Vladyslav Fliahin, Jérôme DIAZ, Mandeep Kular, Glenn Kong, Vladimir Kukushkin, Viktor Malyi, Ruben Broekx, Iqbal Hamdi, Richa Gadgil, Piotr Gruszecki, Jonathan Fürst, Sirine Bhouri, Kyoosik Kim, Sunghyun Ahn, Afjal Chowdhury, Tim Wibiral, Kunal Santosh Sawant, Aman Agrawal, Abdelkader HASSINE, Florian Trautweiler, Mohammed AbuSadeh, Loic Merckel, Lukasz Gatarek, Zombor Varnagy-Toth, Marc Matterson, Manelle Nouar, Paula LC, Shitanshu Bhushan, Matthew Senick, Lewis James | Data Science, Clara Chong, Bilal Ahmed, Pavel Krautsou, Erol Çıtak, Cristovao Cordeiro, Vladimir Zhyvov, Yuval Gorchover, Zach Flynn, Allon Korem | CEO, Bell Statistics, Tony Albanese, Sandra E.G., Miguel Cardona Polo, James Thorn, Vineet Upadhya, Kaushik Rajan, Mahmoud Abdelaziz, PhD, Benjamin Assel, Shirley Li, Marina Wyss – Gratitude Driven, Michal Davidson, Rémy Garnier, Uladzimir Yancharuk, David Lindelöf, Ricardo Ribas, Hunjae Timothy Lee, Ashley Peacock, Rohit Ramaprasad, Alejandro Alvarez Pérez, David Martin, Ben Tengelsen, César Ortega Quintero, Jaemin Han, Max Surkiz, Massimo Capobianco, Tobias Cabanski, Jimin Kang, Felix Schmidt, Paolo Molignini, PhD, Sayali Kulkarni, Alan Nekhom, and Chris Lettieri, among others.

Thank you for supporting the work of our authors! We love publishing articles from new authors, so if you’ve recently written an interesting project walkthrough, tutorial, or theoretical reflection on any of our core topics, don’t hesitate to share it with us.

Until the next Variable,

TDS Team


Data Roles, Small Language Models, Knowledge Graphs, and More: Our January Must-Reads was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.
