About the team
Dialpad’s Multilingual Data Specialist will provide expertise and insight to expand our Ai features, including transcription and sentiment analysis, to additional languages, currently targeting French. In this role, you’ll combine linguistics, technical skills, and project management to ensure that our multilingual features are implemented with the unique characteristics of the target language in mind while maintaining our industry-leading accuracy.
Dialpad’s Data Annotation & Testing team plays an essential role in the development of our Ai features for our suite of business communication products. This team collaborates closely with NLP, ASR, AIR & Product Management to constantly improve our customers' ability to derive valuable insights from their conversations.
Your role
As a Multilingual Data Specialist, you’ll own the collection of training, testing and development data for both ASR and NLP model development for your target languages. You’ll work closely with NLP to develop sentiment models, LLMs, PII detection and more, by using your expertise to guide model design and evaluation. You’ll also develop guidelines to ensure transcription and annotation data is accurate and useful to the data science teams.
This position reports to our Manager of Data Annotation & Testing and can be based in our Kitchener, ON office. The role is mostly remote, with the occasional in-office day.
What you’ll do
- Participate in design meetings and ask clarifying questions to fully understand NLP & ASR team scenarios and data requirements.
- Accurately translate data requests into project requirements and select the correct approach (platform, annotators, data inputs and outputs, etc.) based on data needs.
- Design and implement annotation jobs including guidelines and job setup in Labelbox.
- Effectively train and provide continuous feedback to annotators.
- Ensure that the data resulting from your projects is high quality and meets the standards of the NLP & ASR teams.
- Own basic data projects throughout the project's entire lifecycle.
- Provide linguistic expertise on the different dialects of your target languages.
- Provide phonetic annotation and own the multilingual lexicons.
- Pre- and post-process data using manual and automated methods.
- Create high-quality gold sets for annotation and complete other annotation tasks as needed.
- Provide linguistic expertise on the intersection between our models and the target languages to identify areas where the approach needs to differ from our English implementation.
- Coordinate with the QA Analyst to perform multilingual model testing.
- Understand data privacy and security requirements and apply them as needed.
- Partner closely with Applied Scientists & Ai Product Managers to get an in-depth understanding of their customer use cases and feature design.
- Research the latest developments in machine learning annotation, processes and tooling to optimize our processes and data quality.
Skills you’ll bring
- Bachelor's degree in linguistics or related work experience
- Native or advanced-level proficiency in English and French
- 1+ years of experience with linguistic annotation
- 0-3 years of data analysis experience, including SQL and data cleaning & manipulation
- 0-3 years of experience with basic Python, including pandas
- Experience with research and scientific literature reviews
- Strong desire to work in a role supporting machine learning and data science development
Dialpad benefits and perks
Benefits, time-off, and wellness
An apple a day keeps the doctor away, and it doesn’t hurt that we offer flexible time off and great options for medical, dental, and vision plans for all employees. Employees also receive a monthly stipend to help cover cell phone and home internet bills, and we reimburse gym membership costs, a variety of wellness events, and more!
Professional development
Dialpad offers reimbursement for expenses related to professional development, up to a set limit per calendar year.