
Quality Assurance Analyst (Japanese) | $11/hr Remote
Crossing Hurdles • Japan
-
Role & seniority
-
LLM – AI Quality Analyst (Personalization), Japanese
-
Short-Term Contract (2 months); Part-time (30–40 hrs/week)
-
Remote, global
-
-
Stack/tools
-
AI quality evaluation, data annotation, content moderation concepts
-
Creative/personalization prompt engineering
-
Multi-turn prompt design, evaluation workflows, rationale writing
-
Required: use of a primary personal Google account, with personal data sources enabled for assessment
-
-
Top 3 responsibilities
-
Evaluate a Gemini personalization feature and model responses for intent and personalization
-
Design and execute multi-turn prompts leveraging personal information and past conversations
-
Perform side-by-side evaluations, rank responses by helpfulness/naturalness, and write clear rationales with turn references; verify debug data and ensure proper use of summaries/data sources; delete evaluation conversations after completion
-
-
Must-have skills
-
Strong Japanese reading/writing proficiency
-
Experience in data annotation, AI quality evaluation, or content moderation
-
Excellent analytical thinking and attention to detail
-
Experience with creative prompt engineering and personalization concepts
-
Ability to provide structured, clear written feedback; able to work independently in a remote environment
-
-
Nice-to-haves
-
Familiarity with grounding/hallucination issues in AI
-
Comfort with using personal data sources and Google account for assessments
-
Prior remote/global work experience
-
-
Full Description
Position: LLM – AI Quality Analyst (Personalization) – Japanese
Type: Short-Term Contract (2 months)
Compensation: $11 per hour
Location: Remote (Global)
Commitment: Part-time availability required (30–40 hrs/week)
Role Responsibilities:
- Evaluate a personalization feature for Gemini
- Design and execute multi-turn conversational prompts that require the AI to utilize personal information and experiences
- Assess how effectively the model uses past conversations and activity to generate relevant and helpful responses
- Evaluate model responses based on intent and appropriate personalization
- Analyze responses for grounding issues, including flawed inferences or hallucinations
- Assess integration quality to ensure personal data is incorporated naturally into responses
- Perform side-by-side evaluations and stack-rank model responses based on helpfulness and naturalness
- Write clear rationales referencing specific conversation turns
- Extract and verify debug information to confirm correct use of summaries and data sources
- Maintain data hygiene by deleting evaluation conversations after completion
Requirements:
- Experience in data annotation, AI quality evaluation, content moderation, or related roles
- Strong Japanese proficiency (reading and writing)
- Willingness to use a primary personal Google account and enable personal data sources for assessment
- Strong analytical thinking and attention to detail
- Experience with creative prompt engineering and personalization concepts
- Ability to provide structured feedback and clear written explanations
- Ability to work independently in a remote environment
Application Process:
- Upload resume
- Interview
- Submit form