Reference: 281_25_LS_LT_RE1
Job title: Data Linguist for Language Technologies (RE1)
Closing Date: Tuesday, 22 April 2025
About BSC
The Barcelona Supercomputing Center - Centro Nacional de Supercomputación (BSC-CNS) is the leading supercomputing center in Spain. It houses MareNostrum, one of the most powerful supercomputers in Europe, was a founding and hosting member of the former European HPC infrastructure PRACE (Partnership for Advanced Computing in Europe), and is now a hosting entity for the EuroHPC JU, the Joint Undertaking that leads large-scale investments and HPC provision in Europe. The mission of BSC is to research, develop and manage information technologies in order to facilitate scientific progress. BSC combines HPC service provision and R&D in both computer and computational science (life, earth and engineering sciences) under one roof, and currently has over 1,000 staff from 60 countries.
For this role, we are particularly interested in the strengths and lived experiences of women and underrepresented groups, to help us avoid perpetuating biases and oversights in science and IT research. In instances of equal merit, the incorporation of the underrepresented sex will be favoured.
We promote Equity, Diversity and Inclusion, fostering an environment where each and every one of us is appreciated for who we are, regardless of our differences.
If you feel that you do not meet all the requirements, we still encourage you to apply. We value diversity of experiences and skills, and you could bring unique perspectives to our team.
Context And Mission
The Language Technologies Unit at BSC has consolidated experience in several NLP areas, such as massive language model building, biomedical text mining, machine translation and unsupervised learning for under-resourced languages and domains. It has been entrusted by the Spanish and Catalan governments with the mission of developing fundamental open-source resources and technologies for Spanish and Catalan. In connection with this, the LT Unit is currently in charge of two flagship projects at the national and regional level: the ALIA project, funded by the Spanish Secretariat of Digitalisation and Artificial Intelligence, and the AINA project, aimed at developing AI resources for Catalan and funded by the Catalan Digitalisation Department. In addition, the Unit participates in various EU-funded international projects.
The LT Unit at BSC is looking for a junior research engineer with a background in Applied Linguistics and Language Technologies.
The successful candidate will join a dynamic research environment focused on the training and evaluation of large language models (LLMs). As part of the Data Team, they will contribute to the processing, management, and organization of data used for LLM training, ensuring compliance with legal and ethical standards. They will also support the evaluation of language models through data-driven approaches.
Key Duties
- Collaborate with team members on data collection, cleaning, and preprocessing for LLM training.
- Assist in managing and organizing large-scale multilingual datasets, ensuring data integrity and accessibility.
- Support the implementation of data governance policies to ensure the legal and ethical use of language data.
- Work with other computational linguists and engineers to develop data pipelines and preprocessing workflows.
- Contribute to documentation of data processing methodologies to ensure reproducibility and transparency.
- Assist in evaluating the quality and suitability of datasets for language model development.
Education
- Master’s degree in Computational Linguistics, Theoretical and Applied Linguistics, or a related discipline.
Essential Knowledge and Professional Experience
- Knowledge of Python and experience working with NLP-related libraries such as NLTK and pandas.
- Strong analytical and problem-solving skills, particularly in data analysis and linguistic evaluation.
- Strong understanding of linguistic concepts.
- Ability to work effectively in a collaborative research environment.
- Fluency in spoken and written Spanish and English.
Additional Knowledge and Professional Experience
- Familiarity with deep learning techniques and their application to NLP.
- Experience with language data preprocessing and linguistic annotation.
- Understanding of evaluation metrics for NLP models, such as accuracy, BLEU, and F1 score.
- Experience with tools for version control, such as Git and GitHub/GitLab.
- Native or good level of spoken and written Catalan.
Competences
- Strong organizational and documentation skills.
- Attention to detail and a proactive approach to problem-solving.
- Ability to work both independently and within a team.
- Critical thinking and adaptability in a fast-paced research setting.
- Good communication and presentation skills.
- Ability to work under set deadlines.
- The position will be located at BSC within the Life Sciences Department
- We offer a full-time contract (37.5 h/week), a good and highly stimulating working environment with state-of-the-art infrastructure, flexible working hours, an extensive training plan, restaurant tickets, private health insurance, and support with relocation procedures
- Duration: Open-ended contract due to technical and scientific activities linked to the project and budget duration
- Holidays: 23 paid vacation days plus 24th and 31st of December per our collective agreement
- Salary: we offer a competitive salary commensurate with the qualifications and experience of the candidate and according to the cost of living in Barcelona
- Starting date: as soon as possible
All applications must be submitted via the BSC website and contain:
- A full CV in English including contact details
- A cover/motivation letter with a statement of interest in English, clearly specifying the area and topics for which the applicant wishes to be considered. Two references for further contact must also be included. Applications without this document will not be considered.
The selection will be carried out through a competitive examination system ("Concurso-Oposición"). The recruitment process consists of two phases:
- Curriculum Analysis (40 points): Evaluation of previous experience and/or scientific history, degree, training, and other professional information relevant to the position.
- Interview phase (60 points): The highest-rated candidates at the curriculum level will be invited to the interview phase, conducted by the corresponding department and Human Resources. In this phase, technical competencies, knowledge, skills, and professional experience related to the position, as well as the required personal competencies, will be evaluated. A minimum of 30 points out of 60 must be obtained to be eligible for the position.
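As an illustration only, the two-phase scoring described above can be sketched in Python. The class and constant names below (`CandidateScore`, `CV_MAX`, `INTERVIEW_MAX`, `INTERVIEW_MIN`) are hypothetical and not part of any BSC tool; the sketch assumes the total is simply the sum of both phases, which the text implies but does not state explicitly.

```python
from dataclasses import dataclass

# Point limits taken from the recruitment description; names are hypothetical.
CV_MAX = 40          # maximum points for the curriculum analysis
INTERVIEW_MAX = 60   # maximum points for the interview phase
INTERVIEW_MIN = 30   # minimum interview points required for eligibility


@dataclass
class CandidateScore:
    cv_points: float
    interview_points: float

    def __post_init__(self) -> None:
        # Reject scores outside the ranges stated for each phase.
        if not 0 <= self.cv_points <= CV_MAX:
            raise ValueError(f"cv_points must be between 0 and {CV_MAX}")
        if not 0 <= self.interview_points <= INTERVIEW_MAX:
            raise ValueError(f"interview_points must be between 0 and {INTERVIEW_MAX}")

    @property
    def total(self) -> float:
        # Assumption: the overall score is the sum of both phases (max 100).
        return self.cv_points + self.interview_points

    @property
    def eligible(self) -> bool:
        # Eligibility depends only on the interview threshold.
        return self.interview_points >= INTERVIEW_MIN
```

Under these assumptions, a candidate with 35 curriculum points and 29 interview points would not be eligible despite a higher total than a candidate with 20 and 30 points, because eligibility is decided by the interview minimum alone.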
In accordance with OTM-R principles, a gender-balanced recruitment panel is formed for each vacancy at the beginning of the process. After reviewing the content of the applications, the panel will begin the interviews, with at least one technical and one administrative interview. At a minimum, a personality questionnaire as well as a technical exercise will be conducted during the process.
The panel will make a final decision, and all individuals who participated in the interview phase will receive feedback with details on the acceptance or rejection of their profile.
At BSC, we seek continuous improvement in our recruitment processes. For any suggestions or comments/complaints about our recruitment processes, please contact [email protected].
Deadline
The vacancy will remain open until a suitable candidate has been hired. Applications will be regularly reviewed and potential candidates will be contacted.
OTM-R principles for selection processes
BSC-CNS is committed to the principles of the European Commission's Code of Conduct for the Recruitment of Researchers and to the Open, Transparent and Merit-based Recruitment principles (OTM-R). These principles are applied to every candidate in all our processes, for example by creating gender-balanced recruitment panels and recognizing career breaks.
BSC-CNS is an equal opportunity employer committed to diversity and inclusion. We are pleased to consider all qualified applicants for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability or any other basis protected by applicable state or local law.