Role & seniority: LLM – AI Quality Analyst (Personalization), short-term contract; remote work.

Stack/tools: data annotation/evaluation workflows; prompt engineering for multi-turn conversations; side-by-side (SxS) evaluation; documentation of rationales; management of “Debug Info” data sources; primary Google account with enabled personal data sources; reliable desktop/laptop with internet.

Top 3 responsibilities

Design multi-turn prompts (1–5 turns) using user personal context.
Evaluate personalized AI responses for grounding, integration, helpfulness, and natural personalization; perform SxS comparisons.
Write structured rationales referencing specific turns; extract/verify Debug Info; maintain data hygiene by deleting evaluation conversations.

Must-have skills

Proficiency in Turkish (reading/writing).
Experience in data annotation, AI quality evaluation, content moderation, or related roles.
Strong analytical thinking for nuanced AI judgments; attention to detail.
Experience with creative prompt engineering and multi-turn conversations.
Understanding of personalization concepts and evaluation methodologies.
Excellent written communication and feedback documentation.
BS/BA or equivalent experience; self-motivated, able to work remotely; reliable internet and a suitable device.
Willingness to use a primary Google account with data sources.

Nice-to-haves

Prior work on multi-turn dialogue systems and SxS evaluation.

Full Description

Position: LLM – AI Quality Analyst (Personalization) – Turkish

Type: Short-Term Contract

Location: Remote

Commitment: 30–40 hours/week, 4-hour overlap with PST

Engagement Length: 2 months

Start Date: Immediate

Role Responsibilities

Design multi-turn conversational prompts (1–5 turns) using personal context.
Evaluate personalized AI responses for grounding, integration, and helpfulness.
Assess whether personalization is applied correctly and naturally.
Identify flawed inferences, hallucinations, and incorrect personalization.
Review integration quality to ensure responses are not robotic or over-narrated.
Conduct side-by-side (SxS) evaluation and ranking of model responses.
Analyze subtle differences in tone, clarity, and usefulness.
Write clear, structured rationales referencing specific conversation turns.
Extract and verify “Debug Info” to confirm correct usage of data sources.
Maintain strict data hygiene by deleting evaluation conversations.

Requirements

Turkish proficiency (reading and writing).
Strong experience in data annotation, AI quality evaluation, content moderation, or related roles.
Strong analytical thinking for evaluating nuanced and ambiguous AI responses.
Experience with creative prompt engineering and multi-turn conversations.
Understanding of personalization concepts and evaluation methodologies.
High attention to detail for SxS comparisons.
Excellent written communication and feedback documentation skills.
BS/BA degree or equivalent experience in a relevant field.
Willingness to use a primary personal Google account with enabled personal data sources.
Self-motivated and able to work independently in a remote environment.
Desktop/laptop with reliable internet connection.

Application Process

Fill out the application form.
Complete the ICF.
Complete the assessment.

Quality Assurance Analyst (Turkish) | $11/hr - Remote