Role & seniority: LLM – AI Quality Analyst (Personalization); short-term contract (2 months), part-time (30–40 hrs/week)

Stack/tools: AI quality evaluation; prompt engineering and personalization concepts; multi-turn prompt design; analysis of conversations and data sources; Google account for data sources; remote work tools

Top 3 responsibilities

Evaluate a personalization feature for Gemini and assess how past conversations and activity influence responses
Design and run multi-turn prompts; stack-rank responses by helpfulness and naturalness; write rationales tied to specific turns
Ensure data hygiene: extract and verify debug info, confirm that summaries and data sources are used correctly, and delete evaluation conversations after completion

Must-have skills

Strong Spanish reading/writing proficiency
Experience in data annotation, AI quality evaluation, content moderation, or related roles
Attention to detail, structured feedback, analytical thinking
Experience with creative prompt engineering and personalization concepts
Ability to work independently in a remote environment

Nice-to-haves

Familiarity with grounding issues, hallucinations, and model evaluation metrics
Comfort enabling and using personal data sources for assessment
Experience documenting clear rationales and structured evaluations

Location & work type: Remote (global); short-term contract; compensation $11/hour; part-time commitment

Full Description

Position: LLM – AI Quality Analyst (Personalization) – Spanish

Type: Short-Term Contract (2 months)

Compensation: $11 per hour

Location: Remote (Global)

Commitment: Part-time availability required (30–40 hrs/week)

Role Responsibilities

Evaluate a personalization feature for Gemini
Design and execute multi-turn conversational prompts that require the AI to utilize personal information and experiences
Assess how effectively the model uses past conversations and activity to generate relevant and helpful responses
Evaluate model responses based on intent and appropriate personalization
Analyze responses for grounding issues, including flawed inferences or hallucinations
Assess integration quality to ensure personal data is incorporated naturally into responses
Perform side-by-side evaluations and stack-rank model responses based on helpfulness and naturalness
Write clear rationales referencing specific conversation turns
Extract and verify debug information to confirm correct use of summaries and data sources
Maintain data hygiene by deleting evaluation conversations after completion

Requirements

Experience in data annotation, AI quality evaluation, content moderation, or related roles
Strong Spanish proficiency (reading and writing)
Willingness to use a primary personal Google account and enable personal data sources for assessment
Strong analytical thinking and attention to detail
Experience with creative prompt engineering and personalization concepts
Ability to provide structured feedback and clear written explanations
Ability to work independently in a remote environment

Application Process (takes about 20 minutes)

Fill out the application form
Complete the ICF
Complete the assessment
