Hey, You Got Your Linguistics in My IO Psychology!

Featuring: Karin Golde, Ph.D., Founder of West Valley AI

Linguistics and IO psychology go together like chocolate and peanut butter!

“When you’re a linguist, especially a syntactician, you look at language a little differently. You see the structures under it, and have this x-ray vision of what’s going on under the hood.”

– Karin Golde, Founder of West Valley AI.

In this awesome and inspiring episode of Science 4-Hire, I have an enlightening conversation with Karin Golde, Ph.D., founder of West Valley AI and a Silicon Valley AI veteran. Karin does an amazing job of laying down a solid foundation for the conversation with this truism:

“I think one of the trickiest parts about AI is dealing with language data.”

From here, Karin, with a background in linguistics and extensive experience leading AI data teams, delves into the evolution of AI, the role of linguistics, and the ethical considerations in data sourcing and model training. From this foundation, the discussion orbits around the intriguing intersection of artificial intelligence and industrial-organizational psychology, particularly focusing on high-tech use cases related to AI and hiring.

Key Takeaways:

AI Evolution and Language Data: Karin illustrates her journey from studying theoretical syntax and semantics to leading AI data teams. She emphasizes the tricky parts of dealing with language data in AI and how the evolution of AI has shifted from understanding the structure of language to more sophisticated generative AI models.

Ethical Considerations in AI: A significant part of the discussion revolves around the ethical concerns related to the sourcing and labeling of data for AI and AI bias. Karin sheds light on the often overlooked human labor aspect, highlighting issues like low-wage labor, the quality of data, and the potential psychological impact on workers involved in tasks like toxicity classification.

AI and IO Psychology Intersection: The conversation explores how AI can be leveraged in IO psychology, particularly in deriving insights from text and evaluating qualitative data. Karin suggests the potential of AI to scale and refine the process, though with an emphasis on the need for expert oversight to ensure quality and ethics.

Future of AI in Hiring and Talent Assessment: The episode touches upon the future possibilities and challenges of integrating AI more deeply into hiring processes and talent assessments. Discussions include the potential for creating more efficient systems while also acknowledging the need for robust ethical frameworks and data quality standards.

Visit West Valley AI’s website westvalley.ai to learn more about Karin Golde’s work.

Reach out to Karin on LinkedIn or via email at karin@westvalley.ai for wisdom on AI applications in HR tech or people analytics.

And as always, please check out our YouTube channel for a video of this episode and other awesome content.

Full episode transcript:

Speaker 0: Welcome to Science 4-Hire, with your host, Dr. Charles Handler. Science 4-Hire provides thirty minutes of enlightenment on best practices and news from the front lines of the employment testing universe.

Speaker 1: Hello, and welcome to the latest edition of Science 4-Hire. I am your host, Dr. Charles Handler, and I have an amazing guest here today whom I've actually known for a pretty long time but never really connected with professionally and technically. So I'll talk about that a little bit. But I am going to go ahead and let Karin Golde, who is, I guess, founder of West Valley AI. Right? And longtime veteran of Silicon Valley AI stuff, and I'll let her talk about that because she's got a pretty interesting background, I believe, in linguistics.
Right?

Speaker 2: Yeah. I did recently found West Valley AI as a consulting firm for organizations who are building road maps or planning implementations of AI, and especially setting up strategies for language data, which I think is one of the trickiest parts about AI. And like you said, it's something I've had a lot of experience with. I got my PhD in linguistics way back when, and I've been working at various startups leading teams of data scientists and computational linguists to build enterprise software. And then more recently, I was at Amazon.
I was on the AI data team, leading a team of what we call language engineers there to prepare the data needed for training the kinds of services that AWS has for human language technologies, like transcription, translation, chatbots, things like that. So yeah. No, I'm excited to be here. I'm also a longtime fan of I-O psychology, so I kinda married into the field.
And

Speaker 1: Literally. Literally.

Speaker 2: Yeah. Yeah. So, you know, honestly, like, over the years as I've led teams, I've always gone to my I-O friends for advice on, you know, proper hiring and assessment and management techniques. So yeah. So really, really excited to have a conversation about the intersection of those two things.

Speaker 1: Yeah. For sure. So before we do that, I mean, let's personalize it a little. I mean, I've known your husband, Mark, for a long time. We've worked together.
Great guy, I-O psychologist, educator now, you know. And so it's funny because I've known you and we've hung out, but, you know, I knew you have a PhD in linguistics. Right? I believe you're, like, fluent in Japanese. Right?

Speaker 2: No. No. No. No. Now the rumors are really flying.

Speaker 1: No. Oh, no.

Speaker 2: I was an East Asian Studies minor, and I did take three years of Japanese, but I'm far from fluent.

Speaker 1: Yeah. Oh, you’re not supposed to say that. You’re supposed to say Kanishwa.

Speaker 2: Yeah. Yeah. Konnichiwa. You know, actually, I will say, though, that after studying Japanese for so long, I went on to study German, and I do speak German, like, fairly well. And, you know, with other languages, it is important when you're a linguist to really have tried to get a feel for a number of languages even if you can't really speak them.
Just get a sense of the diversity. It's funny because AI is a very English-centric field Mhmm. Maybe, like, secondarily Chinese, but a lot of the techniques that have been built up over the years start with English as a starting point and then kind of branch out into other languages. But, you know, it's better to think about language more holistically before you dig into things. Yeah.

Speaker 1: And, you know, I never really even connected, you know, oh, well, artificial intelligence, or even data science, whatever, and linguistics. I never really connected those two, but now it's all about that. And maybe when you started in your field, even though machine learning and those kinds of things have been around, I don't know. Were you studying this kind of stuff in a more embryonic state
back then, or were you looking at completely different things related to language?

Speaker 2: Yeah. Well, it’s actually, this is kind of a good intro, I think, like, the evolution of AI. Right? It’s sort

Speaker 1: of Right.

Speaker 2: Right. When I was in grad school, I studied theoretical syntax and semantics. And that is the field of, you know, what is the underlying structure of language, and how do those structures support meaning. When you're a linguist, especially a syntactician, you look at language a little differently from how other people look at it. You kind of see the structures under it, kind of have this x-ray vision, right, of what's going on under the hood.
But, you know, it is very serious work, and there are lots of different interesting phenomena to discover. But I will say, in grad school, like I said, it was theoretical, but still, the kind of models that we were working with were computationally implementable. So my first job out of grad school was at a company that was implementing a parser in the same framework that I had been studying. So basically, this, you know, machine system, AI system I guess you could say, we didn't call it that back then because it wasn't in fashion at the time, would take a sentence and then just output what the structure of that sentence was, how all the different components broke down and related to each other.
So at that time, AI was really more about trying to understand the structure of language, similar to how maybe humans do, and take more of an expert system type approach of saying, like, okay, these are the patterns that map onto this kind of output that we want. And in this case, we were doing automated email responses. So given the way that this email breaks down at the syntactic and semantic level, we want to map it onto, you know, a certain action that the customer service rep needs to take, whether it's tracking an order or changing a password or whatever. So, yeah, a lot of that kind of AI is still around, still exists.
A lot of rules-based expert systems are still either components or, you know, even main components of systems today. But that was the transition.
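
To make that pattern-to-action idea concrete, here is a minimal sketch of a rules-based email router in Python. The patterns and action names are invented for illustration, not taken from the system Karin describes:

```python
import re

# Hypothetical rules mapping surface patterns in a customer email
# to the action a customer service system should take.
RULES = [
    (re.compile(r"\b(track|where is|status of)\b.*\border\b", re.I), "track_order"),
    (re.compile(r"\b(reset|forgot|change)\b.*\bpassword\b", re.I), "reset_password"),
    (re.compile(r"\b(cancel|refund)\b", re.I), "cancel_or_refund"),
]

def route_email(text: str) -> str:
    """Return the first matching action, or fall back to a human agent."""
    for pattern, action in RULES:
        if pattern.search(text):
            return action
    return "escalate_to_human"

print(route_email("Hi, where is my order #1234?"))  # -> track_order
```

Real systems of that era layered full syntactic and semantic parses under rules like these, but the control flow is the same: match a structure, emit an action.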

Speaker 1: Well, lay people, you know, might just think linguistics means foreign languages. Right? Like Mhmm. It's interesting because that's kinda how I thought about it even until I learned a little bit more. So I'm very used to that, and I probably carry this to other places.
Even like when we've been hanging out in the past. Here in New Orleans, where I've been for a while, one of the most interesting things is that the people you meet don't ask you what you do. Mhmm. There's people I know that I have no clue what they do, and I rarely get asked. And when I do say, oh, I'm a psychologist or organizational psychologist, you can imagine the first thing people say: oh, you're gonna analyze me, or, oh, I'm too crazy.
Don't analyze me. Or just assuming that I'm a clinical psychologist and that's what I do, which I understand, you know. So that's interesting. So I don't do that a lot in other situations. And it was just recently I was out in the Bay Area. I went out there to see one of my favorite bands, Devo, with your husband and your son. It was, I think, his first real legit concert, which is amazing.
And that was just an amazing experience. I'll say, like, I felt like watching that band go through, like, most of their, well, they have a huge catalog, but most of the things you know. I just wanna stop and say how profound I feel their message is, in that, like, they are basically saying, the way I interpret it, everything is bonkers and it's getting more bonkers and we're just devolving into insanity. But at the same time, it's beautiful, and we have to embrace it and just make the best out of it, even though we cannot make sense out of what's happening very much. It's our reality, and we gotta love it and enjoy it and just maximize our experience with it. I think that's a good message, especially as things are getting wackier and wackier and people are behaving more strangely.
So anyway, that's an aside, but

Speaker 2: Well, yeah. Actually, I will say, like, Devo kind of crafted that message in the nineteen seventies. Right? And Yes. Somehow it's remained relevant, and I think it was probably relevant before their time too.
So, you know, it's easy to look back on history and think, like, well, that all kinda made sense and it sort of worked out. But, yeah, when we're in the thick of things like AI right now, it's overwhelming. I agree.
And we just have to make the best of it.

Speaker 1: Yeah. Well, anyway, so we got a chance to hang out, and, you know, you've now started kind of your own journey, and I've been doing that in various ways for a long time, so it's good to be able to compare notes, because we're in somewhat similar situations, and we've been doing that a lot. But really just starting to have the dialogue about what's happening now in the world of artificial intelligence, computer science, whatever. It's beyond exciting for me. Like, I really enjoy it.
So we had some good conversations. I learned a lot, and that's really my agenda here on the show too, is to learn. So as we've been talking, you know, we're coming from different backgrounds. I think we both understand one another's, you know, disciplines. And, you know, we basically are, you know, looking to extend our knowledge and do new things.
Right? I feel like there's kind of a convergence as we talk about it. You're interested in I-O topics and applications, and I'm interested in applying artificial intelligence to our discipline. It's really the multidisciplinary approach that I think we're really going to need, or that is actually happening, to be able to help these technologies work within industries, within scenarios, whatever it is. Right?
Because one of us alone isn't gonna be able to pull it off. And I guess that's probably been the case for a long time, but it seems, you know, magnified now, and as we start talking about things, we're starting to converge on certain points, which we'll talk about in a little while. But I think it's good to back up. I've seen some really great presentations and materials you've done on large language models, AI for dummies, you know, just kind of simplifying things in an eloquent and concise way. And I wonder if you can just talk a little bit about, for me, it's all about large language models and, you know, the GPTs and all that good stuff. So lead us to where we are now, maybe a little bit, in terms of the world of artificial intelligence, and personalize that with what you're doing around data sets, because I think that's the molecules, I would say, of this stuff.
Right? The atoms, actually, the atoms more than the molecules. And that's what we've got to start with. The quality of everything we do depends on how our data is managed. Right?
So

Speaker 2: Yeah.

Speaker 1: There you go.

Speaker 2: Yeah.

Speaker 1: How about it?

Speaker 2: Sure. Sure thing. Yeah. So, you know, I think it is helpful to talk about a couple different kinds of AI first, just to situate the conversation. The nineties and early two thousands were really still about these expert systems, especially when it comes to, you know, actual production systems in enterprise software. And the next phase, I think especially in the kinds of disciplines we're talking about, was really more around supervised machine learning, which means giving a specific kind of algorithm a bunch of examples of what it's supposed to do. You know, identify the cats versus the dogs, identify positive or negative sentiment, kind of do these sorts of classification tasks and other related tasks; that's what a lot of it boils down to. As far as data sets go, you can imagine, to give it all those kinds of examples of how you want your system to work, you need some kind of human judgment. Right?
Some kind of human in the loop to create that data for the training, and then also to do the end evaluation and, you know, iterate on improvements and so forth. So you can kind of think of that phase, which we're still very much in. And none of these phases goes away. Right? They just all kind of build on each other and create more and more complex systems.
So these machine learning systems were, you know, generally combined with expert systems in some way. And now we're kind of getting into this generative AI phase, which is where, you know, on the surface, it feels a lot more magical. So rather than

Speaker 1: Right.

Speaker 2: You know, and I will say, like, AI always refers to what seems magical now.

Speaker 1: I like

Speaker 2: That’s the that’s my definite of AI.

Speaker 1: I love it. What seems magical now? Because GPT seems effin' magical. Every time I use it, I'm just blown away, and I'm using it for so many different things, and I'll say it a million times to whoever will listen. Right?
Like, we've had all kinds of faddish things. Even like three or four years ago, people were talking about AI. I remember going to HR Tech; every vendor had, you know, AI plastered on their stuff, and this was probably just machine learning, natural language processing.

Speaker 2: Okay. Well, hold on now, Charles. What do you mean, just machine

Speaker 1: learning? And I love that. No, I just thought it in my head. It's not just, it's just, it's

Speaker 2: You all get bored so easily.

Speaker 1: I stand corrected. I'm not jaded. I know. I know. Right?
So it is a fantastical thing that's done amazing things. But thank you. And then you've got, like, I don't know, blockchain, metaverse, you know, these are all valuable things, but they have not had the profound impact on changing things that this has. And it's always the fad and the hype, and so I hate it.
I'm not a hype guy. But when I find something I believe in, I hype it. And I'm hyping this stuff, and nobody even notices, because everybody is. But anyway,

Speaker 2: Yeah. Yeah. I think you get more luck being noticed if you're on the other side of the hype.

Speaker 1: Yeah. Yeah. Yeah.

Speaker 2: But, yeah, I mean, the reason it seems magical is because, first of all, these are, like, general-purpose, you know, generative systems. And, of course, there's image and now video as well. I'm kind of gonna be centered on language data in this conversation, but some of what I'm saying is relevant to other things too.

Speaker 1: Sure. Sure.

Speaker 2: Sure. But yeah. It's, you know, it's just trained. Like, probably people know a few of the basics. Right?
It’s trained on a huge corpus, which is largely scraped from the Internet.

Speaker 1: The Internet. Yes.

Speaker 2: Also, you know, like books and other kinds of just digitized text. Right? And from that, it just learns to predict the next word. Right? So it's a probability engine: whatever the most probable next word is, that's what it's gonna output.
So that's what, you know, large language models are in a nutshell. Now, that's kind of just the original pretraining I'm describing, though. And I think what a lot of people don't realize is that there's a huge next stage. To make that actually useful, you have to do a lot more training.
And that means feeding in examples. In the case of a system like ChatGPT, it means feeding in a lot of examples of what a question looks like and what a good answer to that question looks like. Thousands of those examples. Right? At least.
Yeah. I mean, across all kinds of subcategories and types of questions. There's a whole taxonomy of the kinds of questions that you expect these kinds of chatbots to be able to handle. So there's all that kind of fine-tuning, and then there's even a second stage where humans have to review and rank the output of the model, because the model, since it's probabilistic, can produce multiple outputs. So you kind of train it; you're constantly sort of nudging the weights in a way that the next predicted word ends up being something that is helpful, harmless, and truthful, which is kind of the triumvirate of how these things are trained.
So I think, you know, this is the kind of stuff I worked on earlier at Amazon. You know, my team was doing a lot of the data for large language models. And, of course, a lot of the datasets for other kinds of, like, supervised models as well. And I think this is what's led to a really strong interest of mine in the role of human judgment in AI creation, because it doesn't just magically know this stuff. Humans have to put all kinds of work into this.
So I will just take this opportunity to say we have an ethical data supply chain problem in AI, where we have gotten used to getting very, very cheap labor to do this kind of work. And there's not a lot of labor protections for the kind of people who do this. They either do it as gig workers, logging onto a platform, or they do it as contractors working for a vendor that the tech company engages with. But either way, you know, there's there's a

Speaker 1: let’s talk about this real quickly because I haven’t ever thought about that. Right? So Mhmm. The immediate thing that comes to my mind is is kind of a parallel. And then we could talk about, like, what does this actually mean?
Because I don’t think people focus on this a lot. But if you think about lithium. Right? Lithium’s used in a lot of EV batteries and stuff. I don’t know how much.
And there’s a really cool thing. I don’t know if you’ve ever read the Atlas of AI. I can’t

Speaker 2: remember that. That’s a great book. Yes.

Speaker 1: Yeah. Yeah. So I think I can’t remember the author’s name, but she talks about lithium in there. Mhmm. And, you know, the it’s dirty.
I mean, you’re using it for clean energy, but you’ve got people in in Africa like just destroying the environment on an individual basis and then selling the lithium to a middleman who sells it to the big corporations. So the lithium is clean. Nobody really knows where it comes from. It’s used to make clean things, but it’s absolutely decimating the area where you’re mining it. So and people don’t think about that when they’re driving around in their EVs or using their power drills or or whatever, taking their bipolar medicine, you know, whatever that is.
But I feel like maybe we’ve got the same problem in AI. I never thought about that. And even not only that, I don’t think people stop and really consider at all. Where the hell does this data come from? Even me, say, oh, it’s trained on a corpus.
We train it on the Internet. Well, I don’t I don’t see in my mind a human being involved in that. I just see some technology being told, go get this stuff. Mhmm. So tell us a little bit more about the very foundational level where in the hell does the day that this thing trained on or these things trained on come from?
In the large language model sense, in the more open

Speaker 2: Mhmm.

Speaker 1: Sense, you know?

Speaker 2: Yeah. Yeah. And I think your lithium example is good, you know. And it is one of these things where we tend to kind of divert attention away from, you know, some difficult truths about this. But, yeah, I would say as far as where the data comes from, I mean, obviously, Google has indexed large portions of the Internet for a long time.
It's kind of accepted that to a certain degree, what's on the Internet is fair game for certain types of fair use. Although even Google has gotten some flak over the years for putting too much information straight on google.com so people don't click through. But, you know, this kind of takes it to another level that I think has made some people very uncomfortable, at scale. It feels like it's kind of piggybacking on a lot of work that, you know, people did and are not compensated for, especially if it's professional work, professional writing, you know, news journalists and things like that. So, you know, there are obviously going to be a lot of lawyers talking about it right now, and I don't have any kind of legal opinion on it.
But I think it's kind of an interesting ethical question that I don't have an answer for. You know, is it okay or is it not okay? I think the second thing to consider, though, is what does this data represent? Because we tend to think of the Internet as being, like, oh, well, that's a mirror of the world. Right?
Like, if anything happens in real life, it gets reflected on the Internet, and so that's kind of a good proxy for who we are as humans. And

Speaker 1: Oh, ouch.

Speaker 2: Yeah. But it’s not really. Right? Because, I mean, for one thing, like I said, that the AI has always been fairly English centric. The Internet is fairly English centric with, you know, a pretty you know, large groups of other languages kind of following behind.
But there it definitely does not it’s not proportional to who who we are as a species.

Speaker 1: Yeah. Yeah.

Speaker 2: Globally speaking. Right? And so that's kinda one issue: it's not really representative of us as a whole. So people are going to receive differential benefits from it.

Speaker 1: Mhmm.

Speaker 2: Another thing is that, like, you know, the Internet tends to bring out the worst in us in a lot of ways.

Speaker 1: Yeah.

Speaker 2: And so, you know, when these models get trained, there is some attempt made to kind of, you know, clean out the dregs of the Internet, I guess, the 4chans of the world. But, you know, all the hate speech and all the

Speaker 1: Right. Right.

Speaker 2: But it’s not really practical to do that. And and even if you get kind of the the worst of it out there, there’s still really interesting kind of biases that that end up, you know, that the data is skewed in certain ways. So yeah. So that’s kinda like, you know, the the pretraining data which just, you know, doesn’t involve a whole lot of human labor, to be honest. Like I said, some attempts at cleaning, but it’s not usually where the focus is.

Speaker 1: Gotcha. Gotcha.

Speaker 2: So that’s kind of the the first big stage.

Speaker 1: Right. Okay. So you’re you’re leading me to knowledge, but first I wanna say when you’re talking about that, the image I had in my mind, my very favorite New Yorker cartoon of all time, which I just looked at something that ranked all the you know, the the most liked or most. And it and this was one of them. So it’s not just me, but it into this one’s like, twenty years old, but it’s a dog sitting in front of a computer.
Have you seen that one?

Speaker 2: Oh, on the Internet, nobody knows you’re a dog?

Speaker 1: Yes. Exactly.

Speaker 2: Yeah.

Speaker 1: I love that. And did you know

Speaker 2: It's a classic.

Speaker 1: Yeah. I submitted a cartoon to the New Yorker. I think you can't really, you know, expect to get in. But I found a guy that does illustrations. Right?
Because I can't draw it. So I had him do it for, like, fifty bucks or something. I submitted it, but I think it's great. I'll give you the visuals. So I was at my mom's place, and there's guest towels in there.
And, you know, it's my old bathroom from when I was a kid. I'm like, nobody's probably touched this guest towel in twenty years, you know. So my thing is, it's when guest towels dream, and it's a guest towel sitting on the bar, but it's got a fantasy, you know, a bubble, and he's a beach towel, like, he's on the beach with a lady in a bikini laying on him, holding a drink under an umbrella. Right? Because guest towels, like, they're not even achieving their purpose of actually, like, drying anybody or doing anything.
And there's other towels that are out there having all kinds of fun in the world. So I

Speaker 2: I do like that, Charles. Yes.

Speaker 1: They didn’t they didn’t accept it. As far as I know, But but I’m gonna get this year for for the holidays, I’m getting copies of it for for all my relatives to go on their wall. So at least somebody will see it. Anyway, get a digression, but so I’m getting down to the point in my notes here. I’m like, well, what about the low wage labor because if you’re just pulling this off the Internet, there’s no low wage labor.
So tell us about that next step of it sounds like these people are classifying and training things so that the model knows what relates to what kind of thing to Yeah. Tell us about that. And how that works?

Speaker 2: Yeah. So, I mean, this has been going on for years as we've been building these predictive models. I'll just give you an example from my own work history, which is that for a good ten years, I was at a company called NetBase, which is now called Quid, doing social media analytics, and that involved a lot of sentiment classification. Mhmm.
Mhmm. So at a high level, a brand would, you know, be able to log on to our platform, see all, like, the social media posts and news posts about them, but then also see kind of aggregate metrics about it, because no one's gonna wanna read through all of it. Right? So one of the aggregate metrics was, how do people feel about your brand? What are the opinions towards your brand?
And in order to, you know, basically check how well our system was working and get things oriented in the right direction, I got a lot of social media posts labeled for sentiment. You know, "I drink Coke." Is this positive, negative, or neutral for the brand Coke? And so forth. You know, just tens of thousands of these that I did over the years.
It was a matter of putting this stuff up on a platform. I used a platform called Appen. And on that platform, people would just see kind of like a survey. Right? Like a little multiple-choice question for each of these things.
And they would get paid what I tried to make work out to something like at least US minimum wage, because I was really targeting US workers for this thing. But there was really no floor. Like, there was no minimum wage that I had to pay them on the platform. I could pay them, like, you know, as little as I possibly could, and people would still do the work. So it wasn't that I had to, you know, attract workers with a certain wage.
You could really go rock bottom, and they would still do the work.
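
To make the supervised-learning loop concrete: a minimal sketch, with made-up posts and labels, of how human judgments like these become training data for a sentiment classifier. The scikit-learn pipeline here is illustrative, not NetBase's actual system:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Made-up posts with human-supplied labels, standing in for the
# tens of thousands of crowd-labeled examples described above.
posts = [
    "I drink Coke every day, love it",
    "Coke tastes flat and way too sweet",
    "Just bought a Coke at the airport",
    "Best soda ever, nothing beats a cold Coke",
    "Never buying Coke again, awful",
    "The store was out of Coke today",
]
labels = ["positive", "negative", "neutral",
          "positive", "negative", "neutral"]

# Classic supervised setup: text features plus a linear classifier.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(posts, labels)

print(model.predict(["I really love an ice cold Coke"]))  # likely ['positive']
```

Every one of those training labels is a human judgment, which is exactly why the quality, and the working conditions, of the labelers matter to the model.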

Speaker 1: Where are they located? Are these people offshore, basically?

Speaker 2: I they’re all over the place, and you can there’s a lot in the in in India, in the Philippines, and east Africa, I was actually narrowing it down to, I think, US and Canada because I was I was looking for sentiment on brands, which is somewhat subjective and requires some cultural knowledge. Yeah. So I was really kind of But you you can, like, especially if it’s, like, you know, classifying images of cats and dogs. There are some things that are not that don’t really require cultural knowledge. You can outsource to to other places.
But yeah. I mean, this required things like having automated quality controls because, obviously, people built bots to try to scam these systems. People poll would, you know, you you could just, like, answer the same question and and, you know, the same Right. You could put positive on everything.

Speaker 1: And you know, it’s

Speaker 2: all natural.

Speaker 1: Christmas tree, everything.

Speaker 2: No. Polycom. Yeah. Yeah. Yeah.
So, you know, that that’s that’s obviously something that you have to have mechanisms to to deal with. And so on this particular platform, what you do is you put in. You have you have kind of a little training and assessment portion at the beginning, so they take a little test quiz as a as a gating factor. But then to make sure that they’re continuing to do good work. As they go along, you have on each page a hidden test question where you’ve already manually put in the correct answer.
And then if they get it wrong, it pops up, you know, you got this wrong thing. And it it lowers their score And if they fall below a certain threshold, which is usually like seventy or eighty percent accuracy on these test questions, then they just could get automatically kicked off

Speaker 1: Proctoring. Proctoring, or, like, slacker control, basically. Right?

Speaker 2: Yeah. Yeah. So that actually works pretty well if you create your test questions well. It turns out that it's actually really hard to create foolproof test questions that aren't also super easy, because you do wanna make them a little tricky, to make sure that people are paying attention to the finer points of the guidelines.
And so you have to constantly walk this line, and, you know, if you let people get things wrong, even though it was really a problem with your own test question, what happens is that their reputation scores go down on the platform. And then they're not able to access, like, some higher-paying jobs, because people tend to get put in tiers according to their work quality.
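
A minimal sketch of that hidden-gold-question mechanism; the threshold and field names are illustrative, not Appen's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class Worker:
    """Tracks a labeler's accuracy on hidden gold (test) questions."""
    worker_id: str
    correct: int = 0
    attempted: int = 0

    def record(self, answer: str, gold_answer: str) -> None:
        self.attempted += 1
        if answer == gold_answer:
            self.correct += 1

    @property
    def accuracy(self) -> float:
        return self.correct / self.attempted if self.attempted else 1.0

THRESHOLD = 0.70  # per the conversation, platforms use roughly 70-80%

def still_qualified(worker: Worker) -> bool:
    """Workers who fall below the accuracy threshold get kicked off the job."""
    return worker.accuracy >= THRESHOLD

w = Worker("w123")
for answer, gold in [("positive", "positive"), ("neutral", "negative"),
                     ("positive", "positive"), ("negative", "negative")]:
    w.record(answer, gold)
print(w.accuracy, still_qualified(w))  # 0.75 True
```

Note how the same logic that catches bots will also penalize careful workers whenever a gold answer is itself wrong, which is the power imbalance Karin goes on to describe.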

Speaker 1: Interesting.

Speaker 2: Yeah. So you can pay more for people who have, you know, higher reputation scores. So there's all kinds of implications for their own livelihoods, and there's really no recourse. Right? They can send you a little message that says, this wasn't fair.
And I actually read those messages and responded to them, but I suspect a lot of people didn't, or don't. And there's not really any requirement to do that. So basically, what you've got is just this huge power imbalance. Right?
Where you're dealing with people who are anonymous to you, and therefore you're really even less incentivized to care about them, how much they're getting paid, what their experiences are like. And that goes for both this kind of platform that I'm describing and also, you know, cases where you have vendors who are working with teams of contractors that they've engaged. You know, there's kind of a tendency to throw things over the wall and just let the vendor deal with it. So I will say that, you know, this has been coming to people's attention more. There was a news story that got repeated quite a bit, I think, about the Kenyan workers last year who were working on OpenAI projects.
They were actually working for a vendor called Sama. And they were doing toxicity classification, which is especially rough, because it means looking at a whole bunch of text, or sometimes images; in this case, it was text. But it's basically like a content moderation job, where you're, you know, identifying graphic abuse and hate speech and just kind of a lot of disturbing, gross stuff that falls into a variety of categories. I think the news stories tended to focus on the fact that they were being paid two dollars a day, which actually wasn't really the issue. The issue was more that they were led to believe that they wouldn't be working such long hours on this thing and that they would be getting psychological support for it, and those things didn't materialize. So it really was very traumatic.
And, you know, the people who spoke out about it from the workforce said it was like torture doing this kind of work. And so, you know, OpenAI ended up canceling the contract early just because things went so sideways. But this is the kind of thing that can happen when you really don't think about people as people, and you're kinda trying to avoid the reality that, you know, AI is really built off of the judgment and the knowledge and intelligence of thousands and thousands of humans. Actually, I should say millions. There's millions
of people who do this kind of work.

Speaker 1: The first thing I think of is, like, well, the pyramids were built by slaves, my people. And this amazing, advanced thing is being built by almost that. And so, you know, we talked about lithium earlier, and you just said Kenya. I mean, that's where the lithium mining is going on. So I'm sure there's other places besides Kenya.
But again, starting to, for whatever it's worth, make these connections. But really, what you're talking about here, apart from the human rights issue, which is significant, is garbage in, garbage out. Right? We've been talking about that for a long time. Right.
I don't know. ChatGPT has been giving me some pretty amazing stuff to think it was built on garbage, but that's not really what we're talking about. So it's all about the quality of the data. Right? And that's, like, your department.
I think you're really putting a stake in the ground, saying, you know, we've got to recognize and really steward this stuff in a way that's gonna make everything that's already really great way better, way more accurate. Because of the bias and the things that are problematic with these things. Not from the human rights standpoint, which I think it's really great that you brought up and talked about, because I honestly don't think a lot of people... you know, we just take it for granted that the data, this stuff, just came from somewhere, from Silicon Valley, from some smart people who just put it all together, and it's not like that. At all. But let's transition a little bit.
I mean, we started talking about kinda some commonalities. We had a conversation where I was telling you, oh, yeah, I'm taking an online prompt engineering class. And, you know, I wanna apply my first little project to taking qualitative interviews that I've done and looking for trends in those interviews using GPT. And you asked me, well, how did I do it before?
And I said, well, I just coded it manually. And then you were like, well, why don't you train a model to do that for you? It'd be a lot more efficient for you. And I'm like, well, in this particular scenario, I'm only doing it once. I don't know how to build models.
GPT is one of those things. It's the Windows instead of DOS, you know, or the Windows instead of punch cards or whatever, where people don't have to know. Like, when I started grad school, I was doing very poorly in statistics because I had to program SPSS on the mainframe. And then halfway through the semester, the first GUI SPSS came out, and I started getting A's because I didn't have to worry about programming.
Right? Because they're not the same skill set. But I was caught in a morass of inadequacy in terms of being a programmer. So it's that kind of a thing. Right?
And so I just want the easy button. I wanna press the button, and, you know, that's where we're headed with all this stuff: agents that combine a bunch of different bots to do something for you. Not Alexa, turn on the lights, but, you know, Alexa, hire me a database engineer to work on backups or whatever. Right? So I guess what I'm getting at is, I'm a simpleton in this stuff.
You’ve got some a lot of tools and experience. And so we started talking about, well, what would it look like, Charles, if you wanted to to build some models to do things. And I I just kinda got into what that’s all about and back to machine learning, which is really I think the more accessible I mean, I’m I’m guessing large language models are all running on machine. It’s all machine learning? Is it not?
Or am I starting to bastardize?

Speaker 2: Yeah. No. No.
You're right. It's all machine learning. Yeah. Just different types of models.

Speaker 1: So we come together on, okay, well, what kind of things in I-O psychology could these models do? What's feasible? You sent me an article which I hadn't seen, from Mike Campion. He's actually been a guest on the podcast. And for SIOP, I was a voter on the distinguished career award, and he was nominated, and it was like a unanimous thing.
Like, he's a stud in our field, for sure. It's a really great article. I think it's in Personnel Psychology, a special edition. If you can get a hold of that... I'll put a link to the article in the show notes when I'm done with this. But we started to just talk about, well, what's needed, what's feasible, etcetera.
So I'm gonna let you talk now. Like, for someone like me, what is someone like you able to do? What could we do together to help advance my field, I guess? Is that too broad of a question? I don't know.
I’m serving it up for you to say whatever you want really right here.

Speaker 2: Yeah. So what can AI do for I-O psychology? What can AI do for

Speaker 1: I-O?

Speaker 2: Yeah. Yeah. The Campion article, it was really interesting to me too. I wanna give credit to Cole Napier and the Instructionally Correct podcast, because that's where I learned about it.

Speaker 1: I’ve been a guest on that. I love it. It’s a Yeah.

Speaker 2: Yeah. Yeah. So I think it did a great job of breaking down all the different kinds of areas where AI is relevant, or maybe not quite ready, or not appropriate, you know, all kinds of analysis of that. And I had a couple of takeaways that, from my perspective, were interesting. One is that they seem to think that there's still a lot of value in just deriving insights from text.
And what I mean by that is, you know, if you're running some kind of data analysis, a lot of times you're working with structured data, and then it's relatively straightforward. But if you're dealing with, you know, survey results, free-text fields, it can be difficult to map that to something that you can use as, basically, a feature, right, for your model. And so what you're doing is coding that out by hand. If that code, the taxonomy of things that you're mapping to, changes every time, then, yeah, it doesn't make sense to build a whole model for that. And so it's kind of interesting to me that it does change so much.
Like, I would think that there would be some kind of, like, you know, standard set of labels or something that you would be working off of.

Speaker 1: Yeah. Well, I’m asking one, you know, it’s one survey with a specific set of questions that I probably won’t repeat again because it’s a research project, you know, to summarize what I owe psychologists in a in enterprise are are doing with technology or whatever. Right? If it were a job analysis interview Mhmm. Tell me about your job, what’s a typical day on the job, you know, what task you do, then for sure, it would be worth it.
I mean, there’s tons of types of interviews and things that that we do that that would be you know, useful. Well, I think for me, it’s really well, let’s just skip to what I feel like. Remember the concept, I think back in the early two thousands like the first Internet. Boom. You know, there was that term killer app.
Right? What’s the next killer app? The thing that’s gonna just you know, make everything else look silly, whatever. Well, to me, the killer app in our field has been scoring basically qualitative data, responses to questions, but even more like assessment center exercises, complex sets of information where an expert would have to be trained and used tools to synthesize that information and digest it and come up with a recommendation or score or, you know, something like that. That has been it takes a lot of time, a lot of energy to do.
Right? And so you can’t scale that. Well, to me, the biggest unlock for our field of hiring and coaching and develop whatever it is, about people as being able to take all that in in a artificial intelligence and come up with the same answer as a human writer who was well trained would do. Because that unlocks a lot of scale and a lot of new opportunities.

Speaker 2: Right. So, theoretically, yeah. I mean, theoretically, that should be possible, and potentially even straightforward, although, you know, once you look at the data, nothing is straightforward. So I think that if enough experts have already labeled enough data, then that's great training and test data for a supervised machine learning model. So, like, one place to start is just there, and see how well that works, or how well you can generalize across different or related use cases to really scale it up.
Now, one thing, though: suppose what you're describing actually breaks down into, I don't know, like, ten different use cases that really need, like, ten different models created, or twenty different models. But, you know, it's still kind of worth it because it's gonna be repeated. The more data you need labeled, the more difficult it can be, because, like in the situations I was describing before, you're outsourcing to people who may or may not have certain expertise. Now, there are places you can go which will provide you with, you know, labelers who have particular expertise, right down to, like, medical and legal and everything. So it's not that that doesn't exist, but I think there have been interesting ways to kind of scale up this kind of labeling with LLMs, so that you can say, here's a category.
Here's a definition of that category. Tell me which of these survey results, you know, fit in that category. That can be a thing that you do. I think it's still not gonna do as well as a supervised model in a lot of cases, or it might just be overkill, because these things can be large and expensive. But if you can use that just to create the training data, just the training labels, you know, to get your models going,
it can be a great way to bootstrap expertise. So kind of, like, you know, adding some rocket fuel to expert opinion. Experts still need to go through and verify all of this, though. If you just take it from a prompt engineering perspective and say, I'm gonna take this very general model and ask it to do exactly what I'm telling it, you're still gonna have to go back and evaluate how well it did. Right.
It's just not gonna be quite trustworthy enough, because, once again, it's a probability engine. It's not specifically trained to do what you're trying to make it do.

Speaker 1: Mhmm. Mhmm. What you’re talking about is the crux of how we need to use this is an in it’s an interaction between the two with the humans really providing the the the last mile in some sense, the first mile and the last mile, but but so let me Let me give you a scenario, and I wanna talk about this scenario in the context of what we just been talking about. There have been some folks, and I think about it as a the level of a vendor. A vendor selling predictive hiring tools.
Right? If you think about situational judgment exercises, you know, things where where people are where they have a lot of choices and they make choices and even in like a gamified situation. Right? There’s a professional like a serious game. People are making choices and those choices have patterns and those there’s an optimal pattern in those choices to score you at the most perfect level.
But there’s other there’s other ways you can go and still score pretty well, then, of course, you can do terribly. And so in those situations, you know, what you’d have is human raiders rating this that exercise a bunch of times. Right? And they’re training. They’re they’re the trainers.
It’s not, you know, someone who’s an expert at or who can classify dogs and cats. It’s It’s expert people who’ve even probably written these things and then have, you know, kinda gone through and said, these are the optimal pathways. I hope I’m not, you know, bastardizing this or chopping, you know, hacking it. But then those experts train the system and then the system uses that training on the data from, you know, from other folks. Right?
And and so I think that we’ve seen that be pretty successful. I guess what we’re looking for is, well, can we just not have the can we just somehow not have those experts take the time and energy to do that? And can we just have some other system, you know, understand what the optimum pathways are through this maze essentially and score people on Right?

Speaker 2: Yeah. And I think, you know, you’re as far as, like, optimal pathways and things like that, you’re getting into kind of reinforcement learning algorithms maybe that I’m not as familiar with, so I can’t I can’t speak to it as well. I can just say that as far as, like, the to the extent that they’re judging language that as opposed to the the decisions that people made, you know, that you you can kind of you you can kind of bootstrapped again this again. Like, there there there are systems out there where you can for example, apply one label to a few different things, and it’ll suggest other things that we’re that should have that label as well. So it goes a lot faster and it starts to build up what’s called labeling functions.
So any any text that has this thing in it probably belongs to this category so you can quickly move through that as as a human reviewer. And that’s a lot faster than looking at, you know, number one. Number two. Number three, looking at the ball individually. You know, so I I kind of envision maybe those those systems like I said, the systems exist.
There’s companies like Watchful and Snorkel. That provide these what I think are super cool platforms, but maybe just need to be, like, repurposed a bit or made more accessible to people in more of a variety of situations. You know, like, like your every day, I O psychologists should be able to access this kind of thing.

Speaker 1: Yeah. Because some of us aren't really good with the programming side of things, but wanna achieve things. And we're just making things easier and easier, so that we can get on to the real human work that the artificial intelligence can't do really well. So, like, taking a lot of the heavy lifting and burdensome tasks away and freeing us up to do more enlightened things, essentially.
We've just kept continually doing that with technology. You know? At some point, the technology may turn around and enslave us to do what it wants. I don't know. I feel like we kind of already are, slightly, in some way, but that's a whole other podcast.
So, I mean, what I'm taking away from this too is, like, I don't know, it'd be interesting if there's ever, like, some kind of a data quality seal. You know, like, we've really gotta understand that data just doesn't fall out of the sky. It has a quality. It has an ethics behind it. At the very foundational level, you know.
I read something that the CEO of Indeed was saying, and it's not the exact same thing, but unless the people who are writing our code and contributing to it are of a diverse background... We're just basically saying the people who are doing all this should be diverse if you want diversity out of the other end of it. There's no empirical thing I could point to for that. But I guess what I'm saying is, there's a quality and an ethics, when we start talking about ethics of AI. Even the models we see, I haven't seen one that says, well, where did your data come from?
What about your training data, whatever? Like, was it coming from an ethical source? You know, kind of like fair trade or whatever the stamps you get on your food that

Speaker 2: You’re prepared data.

Speaker 1: Data. Right? And you know.

Speaker 2: Mhmm. Yeah. There are some, you know, efforts at this. There's something called the Data Nutrition Project, which is looking to standardize data labels like this.
But before that, there was Datasheets for Datasets, which was a paper, I believe, written by Timnit Gebru, who was famously fired from Google a few years ago and has gone on to found her own research institute. Yeah. So there's definitely some movement there around, how do we do better data governance? And, you know, it's funny, because we've had these issues for a long time. Right?
Like, famously, the facial recognition software that gets used by a lot of police, right, to identify suspects. It's trained on mostly white faces, and so Black people are more likely to be misidentified and falsely arrested. This is something that's happening today. And, you know, we probably should have had some kind of checks on that a long time ago. But I think generative AI is kind of raising the specter of safety issues, you know, maybe in a more overblown kind of way.
But hopefully, it will help us sort of walk back and think about, you know, what are the things that are going wrong now that we should start addressing. So, yeah, and, you know, maybe regulation. There's, like, some, you know, specter of regulation, I know, pretty toothless right now, but maybe that kind of thing will help people think more about what it means to evaluate this data for the kinds of behaviors we want the models to have.

Speaker 1: Yeah. I mean, we don’t have any of this if we don’t have data and the the the better the data is. Right? And I mean, that’s again coming back to a linguistics background. It’s it’s understanding how the like, how to label that data, how to make sure there’s semantic meaning and structure in that data that makes sense.
And you know, the more that’s optimized, the better the outcomes are. So, again, garbage in, garbage out.

Speaker 2: You know, I also wanna say something I learned from the Campion article, that is probably, like, really obvious to I-O psychologists, but it wasn't to me, which is that there's this difference between the quality of the criterion data versus the construct validity. And I had never thought about that in that framework, but it really makes sense as the kind of thing that we're up against with data quality, because it comes down to really two things. The first thing is, we generally measure our models according to, like, precision and recall. When you're identifying cats and dogs, how many times does it identify cats correctly? How many times does it identify dogs correctly?
You know, what are the false positive, false negative kind of ratios going on there? That's really, like, the beginning and the end of model evaluation. But I think that counts as, like, that's just criterion data, though, criterion validity. It's not construct validity.
The construct validity is whether those dogs and cats were labeled correctly in the first place, and whether it makes sense for us to think about dogs and cats, you know, in the way that we are. And, like, kind of these more general questions about, you know, whether we're approaching the world in the right way.
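
A quick sketch of the criterion side of that distinction: precision and recall computed against "gold" labels, with toy numbers. Nothing in this calculation can tell you whether the gold labels themselves were right, which is the construct question:

```python
# Toy evaluation: model predictions vs. human "gold" labels for cat/dog.
gold = ["cat", "cat", "dog", "dog", "cat", "dog"]
pred = ["cat", "dog", "dog", "dog", "cat", "cat"]

def precision_recall(label: str):
    tp = sum(g == label and p == label for g, p in zip(gold, pred))
    fp = sum(g != label and p == label for g, p in zip(gold, pred))
    fn = sum(g == label and p != label for g, p in zip(gold, pred))
    precision = tp / (tp + fp)  # of everything called `label`, how much was right?
    recall = tp / (tp + fn)     # of all the true `label`s, how many were found?
    return precision, recall

for label in ("cat", "dog"):
    p, r = precision_recall(label)
    print(f"{label}: precision={p:.2f} recall={r:.2f}")

# None of this checks whether the gold labels were assigned correctly in the
# first place -- that is the construct validity question.
```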

Speaker 1: Brilliant. You have just connected the dots to wrap us up in a way that's extremely understandable, because they are two different things, but they need to be the same thing, if that makes any sense. Right? Because otherwise... It goes back to, you know, a blast from the past, Landy, nineteen eighty six, a foundational article in our field that basically says all validity is the same thing.
Don't try to partition it out into different labels. It's all about that, you know; conceptually, a great article. And I'm sure anybody who's gone through school for what we do has read it. If not, go read it. But what you've just said connects back to that. So really good stuff.
We're running out of time. I could talk about this stuff all day. Hopefully, we'll have a chance to Yeah. Hang out again, off camera, you know, come back out there sometime soon. But just let everybody know.
I say this, and I say it every time: let everybody know where they can find you. Obviously, LinkedIn is where people can find you. So plug something else you're doing, or, you know, let people know how to get in touch with you to tap into your expertise.

Speaker 2: Yeah. Yeah. So you can find me on LinkedIn. That's the easiest starting point.
I do have a website at westvalley.ai. And you can email me at karin at westvalley dot ai. It's all pretty straightforward, as you say. I would really, you know, welcome conversations that people wanna have about the applications of AI to HR tech or people analytics or, you know, other sorts of I-O fields. I find that really interesting.
Or, you know, anything kind of related. I'm really interested in getting a melding of the minds here going, like we did on a very small scale today. But, you know, let's really figure out what each other is talking about and try to make some progress together.

Speaker 1: Yeah. I think our stuff’s a little easier sometimes than than your but, you know, who knows? Well, cool. Very very great. Thank you so much.
And I’ll just play it out here. Listeners out there, you need anything related to talent assessment, ethical talent assessment, rocket hires got you covered. So So come find us and we love talking about new and different use cases, but also doing the the good old stuff like job analysis, etcetera. So look us up. We’re here to talk to you about it.
Have a good one.

Speaker 2: Thank you.

Speaker 1: As we wind down today's episode, dear listeners, I want to remind you to check out our website, rockethire.com, and learn more about our latest line of business, which is auditing and advising on AI-based hiring tools and talent assessment tools. Take a look at the site. There's a really awesome FAQ document around New York City Local Law 144 that should answer all your questions about that complex and untested piece of legislation. And guess what? There's gonna be more to come.
So check us out. We’re here to help.