The Center for Data Innovation spoke to Darja Gutnick and Britta Weber, co-founders of 12grapes, a Berlin-based startup that is developing a job recruitment platform that uses artificial intelligence. Gutnick and Weber talked about how facial analysis can help build effective teams and how AI reduces bias in recruitment.
Nick Wallace: 12grapes uses artificial intelligence and video analysis in interviews to generate recommendations for recruiters interviewing people for jobs. Can you tell us a little more about how this works? What are you looking for in the candidate’s face?
Britta Weber: Our aim is to map facial expressions to personality traits, such as achievement motivation, extroversion, and so on. That’s the first step. Later we want to attach norms and values. So right now we use existing third-party software for the analysis of facial expressions, which tracks “landmarks,” like the nose and cheeks, and identifies a person’s emotional responses. It comes from the market research field. Our aim is to use these features to predict psychological statuses to see whether or not somebody fits well into a team and how they will perform in a particular position. This is our goal.
We’re an early startup, so what we do now is collect data. That’s video data on the one hand, and psychological scales on the other, meaning labels that are generated by having people fill out a questionnaire that predicts these psychological scales and allows us to build a labelled dataset. Video is not the only feature we want to use. Another important feature is text; video is just a part of it.
Darja Gutnick: In addition to that, to give you the bigger picture: the vision of the company is not to replace the job interview, or to improve one specific step in the hiring process. Rather, we have always been heading towards using new technology to improve human collaboration, and in the future maybe also human-AI collaboration. So the vision is to understand what drives human behavior, which psychological key performance indicators (KPIs) underlie decision making, and therefore to be able to optimize while using data in real time. The first step towards that, because we’re a company, is to address a business need, and that’s why we started out with the case of hiring. That doesn’t mean we’ll only be focused on that forever. That’s also why we see video analysis as just one data source. It has had a lot of attention in marketing but has not yet been deployed in the human resources context, and we’re aware of the opportunity. At the moment, based on the state of current scientific research, it’s not possible to just measure somebody’s face and say “this is the best job for you, go for it.”
But of course, the more data points you have (not only what you say in your CV or how you write your e-mails, but also your personality structure, your value orientation, and how much that aligns with a given context), the better it is for predicting how long somebody will stay on board and how they’ll get along with the people there. So instead of saying, “we offer automated video interviews and you don’t need to ask any questions any more, we just measure emotions and then we magically create this predictor,” we incorporate as much data as possible from different sources to get the best predictions possible for how long somebody stays in a team, and their performance and effectiveness and impact. We believe that with the right people you can achieve much more, and “the right people” isn’t about whether you’re right or wrong in general; it’s the combination of actors that creates great outcomes from collaboration.
The idea is to create a “data heart” for each company that not only allows you to improve decision making, but also gives you a much better opportunity to create a culture that works for the people inside of it, and to really utilize culture as a business asset. We all know about culture, right? There are so many perspectives on culture. But it has been really hard to actually put a finger on it and say what it means: what does “cultural fit” mean? Does it mean if I speak with eight people, and they all say they like me, that I’ve been successful?
Our first step to answering these questions was to look into existing research, and we partnered with a researcher at Stanford University trying to crack the same equation: what makes a culture a contributor to a company’s success? There’s a lot of research out there that’s been compiled over the last 100 years about what makes successful collaboration. One of the factors is norm alignment: a common understanding of what’s important in a given context. For example, if you agree on a deadline, you need to actually make sure you can submit the work on time, and if you can’t make it, you need to communicate clearly when you’re going to submit it, and people need to be able to rely on what you say—this is a norm. When we look at successful teams, they each have a shared set of norms. They are not all the same, this does not mean a specific set of norms makes you successful, but it does mean you have to be aligned on the understanding of what’s expected. That’s one aspect that can be brought out and measured with data, and where people who don’t fit that set of norms might fit better in a different team, and therefore create more impact.
Wallace: Different companies want to hire different kinds of people. How do you account for that?
Weber: The whole idea is to find somebody who fits into a particular team. If a team has a particular set of norms or behaviors that they adhere to, it works much better if you hire somebody who adheres to them too. There are some traits that are probably always desirable. For example, you would assume that somebody who’s empathetic works better with others than somebody who is not, and you would assume that somebody with high motivation works better than somebody with low motivation, but in general it’s the values that are important.
What we do is measure the teams first. Everybody in the team goes through the same process as the interviewee, and from that we calculate how well a candidate fits in there. So when you think about hiring preferences, it’s not so much a preference somebody would express, but something we measure in advance.
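To make the idea of measuring the team first and then scoring a candidate against it concrete, here is a minimal sketch. It is not 12grapes' method, which the interview does not describe. The norm names are borrowed from those mentioned later in the conversation, and the 1-to-5 scale, the averaging, and the `team_profile` and `fit_score` helpers are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical norm dimensions, each scored on an assumed 1-to-5 scale.
NORMS = ["customer", "people", "results", "flexibility"]

def team_profile(members):
    """Average each norm score across all team members."""
    return {n: mean(m[n] for m in members) for n in NORMS}

def fit_score(candidate, profile, scale_range=4.0):
    """Return 1 minus the mean absolute gap, normalized by the scale range.

    A value of 1.0 means the candidate matches the team average exactly.
    """
    gap = mean(abs(candidate[n] - profile[n]) for n in NORMS)
    return 1.0 - gap / scale_range

# Illustrative data only; no real measurements are implied.
team = [
    {"customer": 4, "people": 5, "results": 3, "flexibility": 4},
    {"customer": 5, "people": 4, "results": 3, "flexibility": 2},
]
candidate = {"customer": 4, "people": 4, "results": 4, "flexibility": 3}

profile = team_profile(team)
print(profile)
print(fit_score(candidate, profile))
```

Averaging is only one possible choice. A real system would likely weight norms differently and account for the spread of scores within the team.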
Gutnick: However, we did find from customer feedback that they want to have control over the process and they want to be able to influence the priorities they’re looking for. So in the current iteration that we’re testing, we introduced the “shaping culture” feature, which is welcomed by team leads in particular, because it gives them the sense that culture isn’t just something that’s there, but something that can be influenced, and for which you can set targets. We have norms like customer orientation, people orientation, results orientation, and flexibility, or how well people adapt to change. You get a different profile for each team, but that doesn’t mean they have to stay the same: we give the option to adjust priorities, and possible ways to achieve targets, and to select candidates that not only align well with the team, but who also help get the team closer to the desired goal. It’s a balance between educating them about research-based insights and what makes them successful, and keeping the power with the team and the team lead.
We can’t, and don’t want to, drive somebody else’s culture for them. We provide a tool to quantify aspects of culture in each team and help them make decisions. It’s not just a fixing tool, it’s a tool for awareness. It’s kind of like how a Fitbit won’t make you run 200 km per week, but it helps you realize how small efforts can get you closer to your goal.
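The idea of selecting candidates who move a team towards a target profile, rather than only matching its current state, can be sketched as follows. This is a hypothetical illustration, not the "shaping culture" feature itself. The scoring scale, the target profile, and the `profile_with`, `distance_to_target`, and `rank_candidates` helpers are assumptions.

```python
from statistics import mean

# Hypothetical norm dimensions, each scored on an assumed 1-to-5 scale.
NORMS = ["customer", "people", "results", "flexibility"]

def profile_with(members, candidate):
    """Team average on each norm if the candidate joined."""
    group = members + [candidate]
    return {n: mean(p[n] for p in group) for n in NORMS}

def distance_to_target(profile, target):
    """Mean absolute gap between a team profile and the desired profile."""
    return mean(abs(profile[n] - target[n]) for n in NORMS)

def rank_candidates(members, candidates, target):
    """Sort candidate names by how close the team would be to the target after hiring."""
    scored = {
        name: distance_to_target(profile_with(members, c), target)
        for name, c in candidates.items()
    }
    return sorted(scored, key=scored.get)

# Illustrative data: the team lead wants a stronger results orientation.
team = [
    {"customer": 4, "people": 4, "results": 2, "flexibility": 3},
    {"customer": 4, "people": 4, "results": 2, "flexibility": 3},
]
target = {"customer": 4, "people": 4, "results": 4, "flexibility": 3}
candidates = {
    "A": {"customer": 4, "people": 4, "results": 5, "flexibility": 3},
    "B": {"customer": 4, "people": 4, "results": 2, "flexibility": 3},
}

ranking = rank_candidates(team, candidates, target)
print(ranking)
```

In this toy example candidate B matches the current team exactly, but candidate A ranks first because hiring A pulls the team average closer to the target the team lead set.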
Wallace: Some job candidates might find the idea of having their facial expressions analyzed by an algorithm a little unsettling. How do you deal with this?
Gutnick: We’ve thought a lot about this. There is this barrier, and it is tricky. Our earliest tests revealed that in our target group of tech-savvy people from knowledge-intensive startups in Germany, about half the people were weirded out by it. So what we decided to do was offer it as a voluntary option. It’s only one source of information: there’s also the self-report and the quiz. In any case, everything we collect from the candidate is controlled by the candidate, in their private profile. It’s not something that just goes to the company: the candidate gets it first, and gives consent for the company to see how they compare. So there’s no obligation to do a video interview if you don’t feel comfortable.
But we also found that the other 50 percent of people don’t mind it because of the value it gives them. If you get something out of it that really helps you to determine your stats, something about yourself, why some people reject you, why others invite you, then you might begin to think about trading data against that knowledge. And if you decide to do so, and you’re fully aware of it, we’re fine with that, and that happens in a lot of cases. So we give feedback to each and every candidate in an automated way, and then in about one in five cases they call us for additional feedback. So I think the way we overcome this problem is by giving value to the candidate.
We did similar tests in the United States, and the numbers there were quite different. Fewer people were bothered by it. The adoption of video interviews is much higher than in Germany and the EU, and I would assume that the different numbers were a result of adaptation to that. If you give enough information and you’re transparent about what’s being measured, why, and what for, and how it looks and how you can work on the results, I don’t think people will be afraid of that forever. This is a useful technology.
Wallace: Some people have concerns about algorithmic bias. But human bias has been with us longer than algorithms, and we aren’t always aware of our own biases, let alone of what’s going on in other people’s heads. How does artificial intelligence influence the problem of bias in job interviews?
Weber: Our first goal is to reduce the bias in hiring. And by this, I mean human bias. Say you’re hiring a nurse, and you get a man—at least in Germany, that doesn’t fit the typical view of a nurse. You’re hiring a developer, and in comes a woman, maybe an older woman, rather than a man between 25 and 35, which is what you were expecting. I’m not saying people are sexist, but these perceptions are based on what’s dominant in the market, they shape a person’s view of these positions. And you can’t help it!
The idea of these personality assessments is that the person is stripped of that kind of information, you only see how well they fit in your company. That’s without getting to know them, without seeing them, without knowing which school they went to, whether you share the same hobbies, whether they have children that are the same age as yours, whatever—removing this kind of thing helps to remove the bias.
Gutnick: But there is bias in algorithms, and there is not a simple answer to it. There are different ways to battle it, and different things you can build into your prediction models in order to reduce it or to control it.
Weber: Yeah—there’s different parts to the software. First of all, there’s the facial recognition and the landmarks. As I said before, we’re using third-party software there, and these landmarks are trained on a really huge dataset, and I wouldn’t assume there was a lot of bias there—because the bias depends on the dataset you have. The bias might come at a point where humans start putting labels on things. But we only use labels for emotional responses, so right now we’re optimistic that the algorithms won’t suffer too much from bias in the psychological profiles.
But things look different when you consider that we also want to use the time a person spends in a company, or whether somebody’s hired or not, to improve our model of the fit in a company. Because at this point, we start taking information from people who might be biased. Somebody might have been a good fit, but the fact things didn’t work out could be the result of bias in the team. The only way to fight this issue is to analyze the data rigorously, and if push comes to shove, adjust it manually.
Gutnick: And if you uncover a connection between, say, “female” and “nurse,” or “female” and “kindergarten teacher,” you can manually account for this. The first step is to analyze the data to recognize patterns, and to evaluate whether these patterns are what we want them to be, and whether they’re biases rather than actual predictors of success. The way to go here is to analyze and correct. There isn’t a magic sauce for de-biasing; you always need to gather your results first and ask whether they might be subject to bias. That’s what you always do with statistical analysis: I was trained with the sentence, “never believe any data you haven’t manipulated yourself.” Statistics is tricky. You can never just believe your results right away; you need to analyze them and the patterns that underlie them, and for us it was crucial to clean datasets if we saw biased patterns that were influencing our results. It’s a problem that’s existed for as long as we’ve done correlations and regression analysis.
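One simple form the "analyze first" step could take is comparing outcome rates across groups before trusting hiring outcomes as training labels. The sketch below is an assumption about what such a check might look like, not a description of 12grapes' analysis. The record format and the `hire_rates` and `rate_ratio` helpers are hypothetical, and the data is invented for illustration.

```python
from collections import defaultdict

def hire_rates(records):
    """Share of hired candidates per group, from (group, hired) pairs."""
    counts = defaultdict(lambda: [0, 0])  # group -> [hired, total]
    for group, hired in records:
        counts[group][0] += int(hired)
        counts[group][1] += 1
    return {g: h / t for g, (h, t) in counts.items()}

def rate_ratio(rates):
    """Lowest group rate divided by the highest group rate.

    Values well below 1 suggest the outcome differs by group and
    deserves a closer look before being used as a label.
    """
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was hired in any group, so there is no gap to report
    return min(rates.values()) / highest

# Invented records for illustration only.
records = [
    ("f", True), ("f", False), ("f", False), ("f", False),
    ("m", True), ("m", True), ("m", False), ("m", False),
]

rates = hire_rates(records)
print(rates)
print(rate_ratio(rates))
```

A low ratio does not prove bias on its own. As the interview notes, the pattern still has to be evaluated to decide whether it reflects bias or an actual predictor of success.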
Wallace: Why would a company be interested in an AI-based approach to hiring? Do you have any data on how this helps to better match employers with the right employees?
Gutnick: There are two value propositions. One is efficiency: you save time. If you are a knowledge-intensive company, which you are if you’re a tech company, or anything related to that, you go through about eight to ten rounds of interviews on average in order to determine your next hire, on top of all the other related activities. So you screen CVs, and there’s this huge number of interview rounds that help you to determine the cultural fit. We cut all of that out, and help you to focus on the top 10 percent of well-fitting candidates with whom you can go through a more elaborate process.
But because you only apply that process to a small proportion of applicants, you save up to 75 percent of the time. And currently we are running tests with clients where that’s exactly the outcome. And that’s the only way we can sell: the market is full of HR tech solutions, our potential customers are bombarded with different options, so we have to be very rigorous and clear in the value proposition. So we let them try the product, compare filling one vacancy with the product, and filling another without it, and they see that they save significant amounts of time when using us.
The second is basically to leverage your team’s potential to help your team perform better. That’s much less tangible and more fuzzy, so I can’t give you a clear number like 75 percent, but we’re working on that. What we need to get with this value proposition is to be able to compare: if you use our product for learning and development, how does this influence your bottom line? What’s the outcome? So we’re one step further with proving the hiring, we have the proof there with our test clients, and we just need to make the growth and development approach more visible.