Join me, Dr. Charles Handler, for the latest episode of Science 4-Hire as I engage in a fascinating conversation with Neil Morelli, lead IO psychologist at Codility, a global provider of coding assessments.
Our discussion centers around LLMs, such as GPT, and the risks they pose to test security and candidate honesty during assessments. We agree there can be no bad without good, so we also talk about the positive side of LLMs and their role in helping us be more productive while evolving the nature of work itself.
Join us as we explore both sides of the “GPT good or bad?” debate.
We begin by exchanging ideas about the transformative nature of GPT in general. We both agree that it is a significant paradigm shift that seems to have come out of nowhere.
We then get into a really important conversation about the various use cases of GPT in the world of pre-hire talent assessment. We discuss both positive use cases and negative ones such as cheating.
According to Morelli, “cheating” should be defined as:
“a no or low knowledge candidate who wouldn’t be successful otherwise is now using this to basically impute knowledge that they don’t have and signal that they’re qualified for a job that they’re not qualified for.”
But there are many use cases where it is perfectly normal for IT candidates to use tools to support their efforts. For instance, developers regularly use outside tools such as Stack Overflow, and this type of outside assistance is 100% accepted. So how is using GPT to assist with writing code any different? According to Morelli,
“It’s a supercharged version of the knowledge repositories that most developers rely on every day to do their jobs.”
We agree that, when it comes to talent assessment tools outside the realm of coding (i.e., personality, cognitive, simulations, etc.), the format of the questions themselves makes it much harder to use GPT to cheat. Self-report tools use item formats that are more difficult to feed into GPT, and they are often more subjective in nature, creating an entirely different paradigm from coding simulations.
Our conversation then arrives at the idea that solutions should focus on how we weave GPT into the assessment tools we are using or building, rather than simply assuming it is a tool for cheating.
We arrive at shared speculation around the exact role of GPT in shaping the future of hiring assessments. Can it be a force for good, or will it be our downfall?
Neil notes:
“So, I can see the worry. I can see the anxiety that, hey, this is forcing assessment creators, writers, people that are in this business or do this type of work. It’s forcing us to adapt very, very quickly. And that level of change, the pace of change can feel super overwhelming and scary.”
We finish out our discussion with a focus on the idea that LLMs alone will not get our work done for us. Their effective use still requires human intelligence and oversight, both to ask the right questions and to manage the output in support of our end goal. Cheating is no exception here. Those who rely solely on LLMs to do their work for them, or to help them appear to be someone they are not, will likely fail in their goal of putting one over on a potential employer.
To quote the singer Rick Springfield, “We all need the human touch!”
Join me, Dr. Charles Handler, and Neil Morelli on this thought-provoking journey as we explore the potential of LLMs to impact the safety and security of talent assessment, now and in the future.