Like most interesting people, Loïc is a bit of an oxymoron: deeply numerical and statistical, but also creative. Our paths briefly crossed in Amsterdam in 2009, when I became obsessed with his ski jacket, which had a hood with small windows (!) on the side that you could see through. Absurdly functional. I got to know him better in London, even though our paths cross too rarely these days. We spoke in early June as London was tentatively opening up after the Covid-19 lockdown.
Since data and analytics in some ways stand at the opposite end from my interest in narrative and storytelling, the field holds a certain kind of allure for me: it feels foreign, slightly exotic and at the same time completely understandable. After all, these days it’s hard to get anything done without some numbers backing you up.
I was interested to see if Loïc could help demystify some of the talk about machine learning, AI and algorithms. And we did get to that, but first we discussed the travel industry, since this is where Loïc is plying his trade at the moment. So, what can he tell us about data science at one of the world’s largest online travel agencies (OTA), Expedia?
“I develop forecasting and anomaly detection algorithms, which can be based on metrics such as how many bookings we expect tomorrow or travel patterns between countries. We try and predict what will happen in the future, so it’s all about generating value from forecasts. It’s really interesting since there is so much data that can influence what is going to happen.”
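To make the idea of forecasting and anomaly detection a little more concrete, here is a minimal sketch in Python. It is not Expedia's or Loïc's method; it simply assumes a naive forecast (the average of the last few days) and flags a day as anomalous when the actual figure lands far from that forecast. The function names, the seven-day window, the threshold and the booking numbers are all made up for illustration.

```python
from statistics import mean, stdev


def forecast_next(history, window=7):
    """Naive forecast: the mean of the last `window` observations."""
    return mean(history[-window:])


def is_anomaly(history, actual, window=7, threshold=3.0):
    """Flag `actual` if it sits more than `threshold` standard
    deviations away from the naive forecast."""
    recent = history[-window:]
    expected = mean(recent)
    spread = stdev(recent)
    if spread == 0:
        # No variation in recent history: any deviation counts as unusual.
        return actual != expected
    return abs(actual - expected) / spread > threshold


# Hypothetical daily booking counts (invented numbers).
daily_bookings = [100, 104, 98, 101, 99, 103, 102]

print(forecast_next(daily_bookings))    # expected bookings for tomorrow
print(is_anomaly(daily_bookings, 103))  # an ordinary day
print(is_anomaly(daily_bookings, 40))   # a sudden collapse in bookings
```

Real systems would likely account for seasonality, holidays and many other signals, but the basic pattern of "predict, then compare with what actually happened" is the same.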
Since the Covid-19 outbreak he has been busy looking into how it might affect future travel patterns. Unsurprisingly, the kinds of trips people search for have changed following lockdowns across the world. Loïc’s algorithms tend to focus on real-time data, but they are less helpful when it comes to understanding longer-term trends.
“Yesterday I was looking at flights to Toulouse [Loïc’s home town] and everything was sold out. Usually I can get a cheap ticket with a low-cost airline two or three months in advance. The price has not actually changed, but everything is sold out. It’s probably because they decreased the number of flights, so you have more demand than offers now. They probably wanted to mitigate the risk of flying half-empty planes.”
If people are not allowed to travel to certain countries, they might actually start looking at what is closer to them.
“Sometimes I talk to friends who were born in the UK and it seems I’m more interested in travelling in the UK than they are; it’s more exotic for me. They might know France better than I do instead. Isn’t it funny?”
Looking at data science from a more general perspective, what does Loïc think about the fear some people have that he and his merry band of data scientists will come into organisations with their algorithms and automate away people’s jobs? Loïc says that more and more companies are beginning to see the importance of data. He sees no nervousness in his own industry, and he doesn’t think people in any field should view data science as a job-destroying activity.
“Interpreting and communicating these algorithms takes a lot of work, so let’s think about how we can use algorithms effectively. We’re far from just being able to click on a button. My advice would be that everyone should think about how their mindset can be data-driven. There is a shift that needs to happen; it’s like when engineers used to calculate things with pen and paper. Now they analyse software to get new ideas instead.”
As the person who drove into a lake after following their GPS will know, being data-driven can come with risks. What are some important things to consider when taking a data-centric approach?
“The main risk is bias, that an algorithm learns something based on a small sample that cannot be replicated when using a different data set. Bias can be really dangerous since you might not see it the first time. You might need to take a few steps back to realise what might happen. This is why we say that algorithms are powerful when trained on a small sample which can then be generalised to other data samples.
“It’s like if you are trying to tell a kid the difference between a cat and a dog. You show all these pictures of cats and dogs, and the kid will eventually learn to tell the difference. But then you suddenly show a picture of a snake, and a child will realise it’s neither a cat nor a dog. But an algorithm might still want to stick to the original categories and say that it’s a cat or a dog. It will never come up with snake since you haven’t shown it a snake before. The kid might also not know that it’s a snake, but it will know that it’s not a cat or dog.”
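The cat, dog and snake example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-number "features", the nearest-centroid rule and the distance cut-off are all invented here, and real image classifiers are far more elaborate. The point is simply that a classifier which only knows two labels will always answer with one of them, unless it is explicitly given a way to say "neither".

```python
import math

# Hypothetical 2-D feature vectors for each known animal (made-up numbers).
TRAINING = {
    "cat": [(1.0, 1.2), (1.1, 0.9), (0.9, 1.0)],
    "dog": [(3.0, 3.1), (3.2, 2.9), (2.8, 3.0)],
}


def centroid(points):
    """Average position of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


CENTROIDS = {label: centroid(pts) for label, pts in TRAINING.items()}


def classify_closed(x):
    """Always answers with one of the known labels, however strange x is."""
    return min(CENTROIDS, key=lambda label: math.dist(x, CENTROIDS[label]))


def classify_with_reject(x, max_distance=1.0):
    """Same rule, but answers 'neither' when x is far from every known class."""
    label = classify_closed(x)
    if math.dist(x, CENTROIDS[label]) > max_distance:
        return "neither"
    return label


snake = (9.0, 0.1)  # looks nothing like the cats or dogs above

print(classify_closed(snake))       # forced to pick a known animal
print(classify_with_reject(snake))  # allowed to say it is neither
```

The second function behaves more like the kid in Loïc’s example: it still cannot name the snake, but it can at least recognise that it has not seen anything like it before.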
So why is it that algorithms based on data, which some might see as unbiased in their very nature, can be just as biased as people? For Loïc it comes down to the data source; if the data you put in is biased, the algorithm is likely to produce biased results.
“Avoiding bias in algorithms is a massive research field, and I’m no expert, but you need to test your algorithm. You need to think broad enough and outside the box. You need to know the data and understand the data to come up with a solution.”
When it comes to algorithms, working together with people in other fields is key: the data scientist needs to work with a person who is an expert on the data, who knows what it means and who is aware of any inherent complexities and issues.
“If you don’t have that knowledge or expertise, you’re not doing data science; one could argue that you are only doing machine learning. The first step is knowing your data. If it’s complicated medical data, for example, you need an expert.”
So, if data is the new oil, what can people who might not be interested in data science do to start understanding its potential benefits?
“I would go straight to use cases from different companies, or look at what is happening in logistics, finance or sport. They can give you a high-level understanding of how they use data to generate value. Once you understand this you don’t need any technical expertise, since there are people like me who have that. Look at something you like; if you like football, for example, Liverpool is using data to generate value by looking at the weather and the tactics of their opponents. You don’t need to understand the algorithm, but the fact that you might need to adapt against a certain team, since it increases the statistical likelihood of winning against them, is valuable information.”
For those familiar with the English Premier League, and Liverpool’s dominant march to their first league title in 30 years, that seems like sound advice indeed.
You can follow Loïc on LinkedIn.