
Reducing Time to Value for Data Science Projects: Part 4

AiNEWS2025 by AiNEWS2025
2025-08-13
in Machine Learning


The final part of this series on reducing the time to value of your projects (see part 1, part 2 and part 3) takes a less implementation-led approach and instead focusses on best practices for developing code. Rather than detailing what and how to code explicitly, I want to talk about how you should approach project development in general, which underpins everything that has been covered previously.

Introduction

Being a data scientist involves bringing together lots of different disciplines and applying them to drive value for a business. The most commonly prized skill of a data scientist is the technical ability to produce a trained model ready to go live. This covers a wide range of required knowledge such as exploratory data analysis, feature engineering, data transformations, feature selection, hyperparameter tuning, model training and model evaluation. Learning these steps alone is a significant undertaking, especially in the constantly evolving world of Large Language Models and Generative AI. Data scientists could devote all their learning to becoming technical powerhouses, knowing the inner workings of the most advanced models.

While being technically proficient is important, there are other skills that should be developed if you want to be a truly great data scientist. Chief amongst these is being a good software developer. Being able to write robust, flexible and scalable code is just as important, if not more so, than knowing all the latest techniques and models. Lacking these software skills will allow bad practices to creep into your work and you will end up with code that may not be suitable for production. Embracing software development principles will give you a structured way of ensuring your code is high quality and will speed up the overall project development process.

This article will serve as a brief introduction to topics that multiple books have been written about. As such I do not expect this to be a comprehensive breakdown of everything software development; instead I want this to merely be a starting point in your journey in writing clean code that helps to drive forward value for your business.

Set Up Your DevOps Platform Properly

All data scientists are taught to use Git as part of their education to carry out tasks such as cloning repositories, creating branches, pulling / pushing changes etc. These tend to be backed by platforms such as GitHub or GitLab, and data scientists are content to use these purely as a place to store code remotely. However they have significantly more to offer as fully fledged DevOps platforms, and using them as such will greatly improve your coding experience.

Assigning Roles To Team Members In Your Repository

Many people will want or need to access your project repository for different purposes. As a matter of security, it is good practice to limit how each person can interact with it. The roles that people can take typically fall into categories such as:

  • Analyst: Only needs to be able to read the repository
  • Developer: Needs to be able to read and write to the repository
  • Maintainer: Needs to be able to edit repository settings

For data scientists, you should have more senior members of staff on the project be maintainers and junior members be developers. This becomes important when deciding who can merge changes into production.

Managing Branches

When developing a project with Git, you will make extensive use of branches to add features and develop functionality. Branches can be split into different categories such as:

  • main/master: Used for official production releases
  • development: Used to bring together features and functionality
  • features: Used for code development work
  • bugfixes: Used for minor fixes
[Figure: Proper management of branching structure simplifies the development process. Image by author]

The main and development branches are special as they are permanent and represent the work that is closest to production. As such special care must be taken with these, namely:

  • Ensure they cannot be deleted
  • Ensure they cannot be pushed to directly
  • They can only be updated via merge requests
  • Limit who can merge changes into them

We can and should protect these branches to enforce the above. This is normally the job of project maintainers.

When deciding merge strategies for adding to development / main we need to consider:

  • Who is allowed to trigger and approve these merges (specific roles / people?)
  • How many approvals are required before a merge is accepted?
  • What checks does a branch need to pass to be accepted?

In general we may have less strict controls for updating development vs updating main but it is important to have a consistent strategy in place.
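To make these merge rules concrete, here is a minimal sketch of how they could be written down and checked. In practice your DevOps platform enforces this through its branch protection settings; the role ordering, approval counts and check names below are all illustrative assumptions, not settings from any particular platform.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Dict, FrozenSet, Iterable


class Role(IntEnum):
    """Hypothetical roles, ordered by increasing privilege."""
    ANALYST = 1
    DEVELOPER = 2
    MAINTAINER = 3


@dataclass(frozen=True)
class MergePolicy:
    """Illustrative merge rules for one protected branch."""
    min_role_to_merge: Role
    required_approvals: int
    required_checks: FrozenSet[str]


# Assumed policies: main is stricter than development.
POLICIES: Dict[str, MergePolicy] = {
    "main": MergePolicy(Role.MAINTAINER, 2, frozenset({"tests", "lint", "coverage"})),
    "development": MergePolicy(Role.DEVELOPER, 1, frozenset({"tests", "lint"})),
}


def can_merge(target: str, role: Role, approvals: int, passed_checks: Iterable[str]) -> bool:
    """Return True if a merge request into `target` satisfies its policy."""
    policy = POLICIES.get(target)
    if policy is None:
        # Unprotected branches (e.g. feature branches): any role with write access.
        return role >= Role.DEVELOPER
    return (
        role >= policy.min_role_to_merge
        and approvals >= policy.required_approvals
        and policy.required_checks <= set(passed_checks)
    )
```

Writing the policy down as data, even informally, makes it easier for the team to see where development and main differ and to keep the strategy consistent.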

When dealing with feature branches you need to consider:

  • What will the branch be called?
  • What is the structure to the commit messages?

What is important is to agree as a team on the guidelines for naming branches. Some examples could be to name them after a ticket, to have a common list of prefixes to start a branch with or to add a suffix at the end to easily identify the owner. For the commit messages, you may want to use a 3rd party tool such as Commitizen to enforce standardisation across the team.
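As an illustration of what an agreed convention might look like once written down, the sketch below validates branch names and commit messages with regular expressions. Both patterns are assumptions made up for this example: the branch format (prefix, ticket id, short description) is hypothetical, and the commit format is only loosely modelled on the Conventional Commits style.

```python
import re

# Assumed branch convention: <prefix>/<TICKET>-<number>-<short-description>,
# e.g. feature/DS-123-add-scaler
BRANCH_PATTERN = re.compile(r"(feature|bugfix)/[A-Z]+-\d+-[a-z0-9]+(-[a-z0-9]+)*")

# Assumed commit convention: "<type>(<optional scope>): <summary>"
COMMIT_PATTERN = re.compile(r"(feat|fix|docs|test|refactor|chore)(\([a-z0-9-]+\))?: \S.*")


def is_valid_branch_name(name: str) -> bool:
    """Return True if the whole branch name follows the assumed convention."""
    return BRANCH_PATTERN.fullmatch(name) is not None


def is_valid_commit_message(message: str) -> bool:
    """Check only the first line (the summary) of a commit message."""
    lines = message.splitlines()
    first_line = lines[0] if lines else ""
    return COMMIT_PATTERN.fullmatch(first_line) is not None
```

A check like this could run in a Git hook or a pipeline job, so that nobody has to police names by hand.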

Maintain a Consistent Development Environment

Taking a step back, developing code will require you to:

  • Have access to the programming language's software development kit
  • Install 3rd party libraries to develop your solution

Even at this point care must be taken. It is all too common to run into the scenario where solutions that work locally fail when another team member tries to run them. This is caused by inconsistent development environments where:

  • Different versions of the programming language are installed
  • Different versions of the 3rd party library are installed

Having everyone develop within the same environment, one that replicates production conditions, avoids compatibility issues between developers, gives confidence that the solution will work in production and eliminates the need for ad-hoc installation of libraries. Some recommendations are:

  • Use a requirements.txt / pyproject.toml at a minimum. No pip installing libraries on the fly!
  • Look into using Docker / containerisation to have fully shippable environments
[Figure: Consistent environments and libraries ensure reproducibility and reduce friction. Image by author]

Without these standardisations in place there is no guarantee that your solution will work when deployed into production.
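One lightweight way to spot environment drift is to compare what is installed against the pinned versions. The sketch below is a minimal illustration that assumes simple `name==version` pins; the function names are hypothetical, and it ignores extras, environment markers and version ranges that a real requirements file may contain.

```python
import sys
from importlib import metadata
from typing import Dict, Optional, Tuple


def parse_pins(requirements_text: str) -> Dict[str, str]:
    """Parse 'name==version' lines; comments, blanks and unpinned lines are skipped."""
    pins: Dict[str, str] = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each out-of-sync package to (pinned version, installed version or None)."""
    mismatches: Dict[str, Tuple[str, Optional[str]]] = {}
    for name, wanted in pins.items():
        try:
            installed: Optional[str] = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


def python_matches(expected: Tuple[int, int]) -> bool:
    """Check the running interpreter against an expected (major, minor) version."""
    return sys.version_info[:2] == expected
```

Running such a check at the start of a notebook or pipeline gives an early warning, although a container image remains the more robust way to guarantee identical environments.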

Readme.md

A readme is the first thing people see when they open a project on your DevOps platform. It gives you an opportunity to provide a high level summary of your project and tells your audience how to interact with it. Some important sections to put in a readme are:

  • Project title, description and setup to get people onboarded
  • How to run / use so people can use any core functionality and interpret the results
  • Contributors / point of contact for people to follow up with
[Figure: A one-stop shop for getting users onboarded onto your project. Image by author]

A readme doesn’t need to be extensive documentation of everything relevant to a project, merely a quick start guide. More detailed background, experimental results etc can be hosted somewhere else, such as an internal Wiki like Confluence.

Test, Test And Test Some More!

Anyone can write code but not everyone can write correct and maintainable code. Ensuring that your code is bug free is critical and every precaution should be taken to mitigate this risk. The simplest way to do this is to write tests for whatever code you develop. There are different varieties of tests you can write, such as:

  • Unit tests: Test individual components
  • Integration tests: Test how the individual components work together
  • Regression tests: Test that any new changes haven’t broken existing functionality

Writing a good unit test is reliant on a well written function. Functions should try to adhere to principles such as Do One Thing (DOT) or Don’t Repeat Yourself (DRY) to ensure that you can write clear tests. In general you should test to:

  • Show the function working
  • Show the function failing
  • Trigger any exceptions raised within the function
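The three cases above can be sketched for a small function as follows. The function `is_valid_probability` is a hypothetical example invented for illustration. The tests are written as plain functions that pytest would discover; with pytest installed, `pytest.raises(TypeError)` is the more usual way to write the exception test.

```python
def is_valid_probability(value: float) -> bool:
    """Return True if `value` lies in the closed interval [0, 1]."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"expected a number, got {type(value).__name__}")
    return 0.0 <= value <= 1.0


def test_accepts_value_inside_range():
    # Show the function working
    assert is_valid_probability(0.25) is True


def test_rejects_value_outside_range():
    # Show the function failing
    assert is_valid_probability(1.5) is False


def test_raises_on_non_numeric_input():
    # Trigger the exception raised within the function
    try:
        is_valid_probability("0.5")
    except TypeError:
        return
    raise AssertionError("expected TypeError for non-numeric input")
```

Because the function does one thing, each test needs only a single call and a single assertion.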

Another important aspect to consider is how much of your code is tested, also known as the test coverage. While achieving 100% coverage is the ideal scenario, in practice you may have to settle for less, which is okay. This is common when you are coming into an existing project where standards haven’t been properly maintained. The important thing is to start with a coverage baseline and then try to increase it over time as your solution matures. This will involve some technical debt work to get the tests written.

pytest --cov=src/ --cov-fail-under=20 --cov-report term --cov-report xml:coverage.xml --junitxml=report.xml tests

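This example pytest invocation both runs the tests and checks that a minimum level of coverage (here 20%) has been attained. The --cov options are provided by the pytest-cov plugin, which needs to be installed alongside pytest.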

Code Reviews

The single most important part of writing code is having it reviewed and approved by another developer. Having code looked at ensures:

  • The code produced answers the original question
  • The code meets the required standards
  • The code uses an appropriate implementation

Code reviewing data science projects may involve extra steps due to their experimental nature. While this is far from an exhaustive list, some general checks are:

  • Does the code run?
  • Is it tested sufficiently?
  • Are appropriate programming paradigms and data structures used?
  • Is the code readable?
  • Is the code maintainable and extensible?
def bad_function(keys, values, specific_key, new_value):
    # Deliberately poor style: parallel lists, no type hints, no docstring
    for i, key in enumerate(keys):
        if key == specific_key:
            values[i] = new_value
    return keys, values

The above code snippet highlights a variety of bad habits, such as using lists instead of a dictionary and omitting type hints and docstrings. From a data science perspective you will additionally want to check:

  • Are notebooks used sparingly and commented appropriately?
  • Has the analysis been communicated sufficiently (e.g. graphs labelled, dataframes described etc.)?
  • Has care been taken when producing models (no data leakage, only using features available at inference etc.)?
  • Are any artefacts produced and are they stored appropriately?
  • Are experiments carried out to a high standard, e.g. set out with a research question, tracked and documented?
  • Are there clear next steps from this work?
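Returning to the bad_function snippet earlier in this section, one possible cleaner version is sketched below. It swaps the parallel lists for a dictionary and adds type hints and a docstring. The name `update_value`, the `new_value` parameter and the choice to raise on an unknown key and to return a copy are all assumptions made for illustration.

```python
from typing import Any, Dict, Hashable


def update_value(mapping: Dict[Hashable, Any], key: Hashable, new_value: Any) -> Dict[Hashable, Any]:
    """Return a copy of `mapping` with `key` set to `new_value`.

    The input dictionary is left untouched so callers are not surprised
    by in-place changes.

    Raises:
        KeyError: if `key` is not already present in `mapping`.
    """
    if key not in mapping:
        raise KeyError(key)
    updated = dict(mapping)  # shallow copy
    updated[key] = new_value
    return updated
```

A reviewer can now tell from the signature and docstring alone what goes in, what comes out and how failures surface.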

There will come a time when you move off the project onto other things, and someone else will take over. When writing code you should always ask yourself:

How easy would it be for someone to understand what I have written and be comfortable with maintaining or extending functionality?

Use CICD To Automate The Mundane

As projects grow in size, both in people and code, having checks and standards becomes more and more important. This is typically done through code reviews and can involve tasks like checking:

  • Implementation
  • Testing
  • Test Coverage
  • Code Style Standardization

We additionally want to check security concerns such as exposed API keys / credentials or code that is vulnerable to malicious attack. Having to manually check all of these for each code review can quickly become time consuming and could also lead to checks being overlooked. A lot of these checks can be covered by 3rd party libraries such as:

  • Black, Flake8 and isort
  • Pytest

While this alleviates some of the reviewer's work, there is still the problem of having to run these libraries yourself. What would be better is the ability to automate these checks and others so that you no longer have to. This allows code reviews to be more focussed on the solution and implementation. This is exactly where Continuous Integration / Continuous Deployment (CICD) comes to the rescue.

[Figure: Automating checks frees up developer time. Image by author]

There are a variety of CICD tools available (GitLab Pipelines, GitHub Actions, Jenkins, Travis etc) that allow the automation of tasks. We could go further and automate tasks such as building environments and even training / deploying models. While CICD can encompass the whole software development process, I hope I have motivated some useful examples of its use in improving data science projects.
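Pipeline definitions are specific to each CICD tool, but the core idea can be sketched in plain Python: run every check as a command and fail the job if any of them fails. The check list below is an assumption for a typical Python project (each tool must already be installed), and the function names are hypothetical.

```python
import subprocess
from typing import Dict, Sequence

# Assumed checks; adjust the commands to match your own project layout.
DEFAULT_CHECKS: Dict[str, Sequence[str]] = {
    "black": ["black", "--check", "."],
    "isort": ["isort", "--check-only", "."],
    "flake8": ["flake8", "."],
    "pytest": ["pytest"],
}


def run_checks(checks: Dict[str, Sequence[str]]) -> Dict[str, bool]:
    """Run each command and record whether it exited with status 0."""
    results: Dict[str, bool] = {}
    for name, command in checks.items():
        try:
            completed = subprocess.run(list(command), capture_output=True, text=True)
            results[name] = completed.returncode == 0
        except FileNotFoundError:
            # The tool is not installed; treat that as a failed check.
            results[name] = False
    return results


def exit_code(results: Dict[str, bool]) -> int:
    """Return 0 if every check passed, otherwise 1 (a non-zero code fails a CI job)."""
    return 0 if all(results.values()) else 1
```

A pipeline step could then call `sys.exit(exit_code(run_checks(DEFAULT_CHECKS)))`, so a failed check blocks the merge request without a reviewer having to run anything by hand.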

Conclusion

This article concludes a series where I have focussed on how we can reduce the time to value for data science projects by being more rigorous in our code development and experimentation strategies. This final article has covered a wide range of topics related to software development and how they can be applied within a data science context to improve your coding experience. The key areas were leveraging DevOps platforms to their full potential, maintaining a consistent development environment, the importance of readmes, testing and code reviews, and leveraging automation through CICD. All of these will help you develop software that is robust enough to support your data science projects and provide value to your business as quickly as possible.
