Role & seniority: LLM – AI Quality Analyst (Personalization); short-term contract (2 months); part-time availability 30–40 hrs/week.

Stack/tools: AI quality evaluation, data annotation, content moderation; creative prompt engineering; personalization concepts; evaluation of past conversations and data source usage; side-by-side evaluations and stack ranking; data hygiene and privacy (delete evaluation conversations); remote work; requires use of personal Google account for assessment.

Top 3 responsibilities

Evaluate a Gemini personalization feature and assess how personal data is used in responses.
Design/execute multi-turn prompts leveraging personal information; analyze grounding, precision, and hallucinations.
Produce structured feedback with rationales, extract/verify debug info, and maintain data hygiene by deleting evaluation conversations.

Must-have skills

Strong Chinese reading/writing proficiency
Experience in data annotation, AI quality evaluation, content moderation, or related roles
Strong analytical thinking and attention to detail
Experience with creative prompt engineering and personalization concepts
Ability to provide structured feedback and clear written explanations
Ability to work independently in a remote environment; stable internet

Nice-to-haves

Familiarity with data privacy considerations and data-source integration
Experience with evaluation frameworks, grounding checks, and debugging data sources

Full Description

Position: LLM – AI Quality Analyst (Personalization) – Chinese

Type: Short-Term Contract (2 months)

Compensation: $11 per hour

Location: Remote (Global)

Commitment: Part-time availability required (30–40 hrs/week)

Role Responsibilities

Evaluate a personalization feature for Gemini
Design and execute multi-turn conversational prompts that require the AI to utilize personal information and experiences
Assess how effectively the model uses past conversations and activity to generate relevant and helpful responses
Evaluate model responses based on intent and appropriate personalization
Analyze responses for grounding issues, including flawed inferences or hallucinations
Assess integration quality to ensure personal data is incorporated naturally into responses
Perform side-by-side evaluations and stack-rank model responses based on helpfulness and naturalness
Write clear rationales referencing specific conversation turns
Extract and verify debug information to confirm correct use of summaries and data sources
Maintain data hygiene by deleting evaluation conversations after completion

Requirements

Experience in data annotation, AI quality evaluation, content moderation, or related roles
Strong Chinese proficiency (reading and writing)
Willingness to use a primary personal Google account and enable personal data sources for assessment
Strong analytical thinking and attention to detail
Experience with creative prompt engineering and personalization concepts
Ability to provide structured feedback and clear written explanations
Ability to work independently in a remote environment
Desktop or laptop with a stable internet connection

Application Process

Fill out the application form
Complete the ICF
Complete the assessment
