Our Principal Data Scientist, Joanna McKenzie, gives her thoughts on how to manage data projects and offers the opportunity to learn more in her video, “Managing Data Projects”.
Project management in data doesn’t really have the same glamour as “data scientist”, but the longer I’ve worked with data, the more important project management skills have become. There’s a lot to know on the project management front, because different circumstances and contexts demand different project management approaches.
Formal processes are possible when projects are consistent
I started my analyst career working as a wind analyst for renewable energy projects. This was different to my current data science role in a number of ways, but one was simply that there was one analysis methodology which was repeated again and again for different projects. In such an environment, it’s fairly straightforward to put processes in place which help constrain the analysis: you do job 1 with tool A, model process 2 with tool B, and so on. You can include best practice guidelines for quality control, and have detailed report templates for different deliverables. Because the process is fairly constant there’s space for all kinds of ways to support best practice.
My current role as a data scientist is different: each analysis I perform is new. I’m exploring a new dataset, with a new goal and context every time. Having formal processes and templates for everything is not only no longer useful, it’s completely out of reach: keeping a range of processes and templates up to date would take too much work and time to be worthwhile.
Project management is still extremely valuable
It’s tempting to think that this means project management of any sort is not valuable, but in fact it remains extremely valuable. Project management is how you communicate with your clients, users, stakeholders and colleagues. It’s how you measure your progress and store your files. It’s how you ensure you can check for errors and correct them, maintaining the correction through any later work. It’s how you create projects that you can pick up and put down as the demands on your time ebb and flow. It’s also key to how you can be consistently confident that the insight you communicate is genuine insight and not an artefact of poor processes.
Learn more about project management approaches
I recently recorded some of my thoughts about basic project management for the University of Glasgow’s maths and statistics department. The talk was aimed at people who need to learn how to approach a data project with professionalism, and it offers some structure they can apply in their work. I think these basic project management approaches will stand a data scientist in good stead; used alongside tools like GitHub for code management, they can support a much better, more consistent approach in your work.
The challenges of project management with data projects
That said, more advanced project management in data projects also fascinates me. I have worked with waterfall projects in the past (that’s the methodology you would use for large capital projects such as building wind farms) and I’m pretty convinced it’s a poor choice for a data project. However, I also think an agile approach has some limitations when it comes to data. Data projects tend to rely on the work of individual data scientists, which doesn’t meld into teamwork too well, and the daily standups and short sprints of an agile project can be frustrating when combined with the subtle complexities of a unique dataset. I am currently working on a way to apply design thinking to data projects, something I think is potentially very useful for moving from scoping a project through to productionising it, even if it doesn’t particularly affect the actual modelling or coding used for the project.
I think all projects benefit from a level of pragmatism and flexibility in the project manager’s approach, and perhaps that is particularly true of agile projects. I am the sort of person who becomes excited by unanswered questions, and I do think we’re still looking for a project management approach that really works for data projects. It’s an unanswered question, and it’s exciting.