Internship offers
Internship: 5-month internship: statistical analysis of exercise tests to improve patient diagnosis
Apply before: 31/01/2025
How to apply:
By email to paul.chauchat@lis-lab.fr and luca.thiebaud@lis-lab.fr
Summary:
This internship is part of a research project that analyzes data from cardiopulmonary exercise tests (CPET). CPETs are used to assess a patient's physiological responses during maximal exercise. The goal of the internship is to help exploit CPET data through a Network Physiology approach.
Internship: Multimodal conversation script generation
Apply before: 15/12/2024
How to apply:
Send a CV, transcripts, and an application letter to Benoit Favre
Summary:
The MINERAL ANR project aims at generating enriched representations of multiparty conversations in the form of a conversation script similar to a movie script or a play script.
The goal of this internship is twofold:
1) Propose an evaluation methodology for assessing the quality of a generated script
2) Build and assess baselines leveraging disjoint building blocks such as speech transcription and automated description of video scenes
Internship: Model-free control of an underwater vehicle
Apply before: 01/04/2024
How to apply:
To apply: send a CV, a cover letter, and academic transcripts to moussaal@univ-tln.fr
Summary:
The main objective of this work is to devise and develop, drawing on tools and developments from algebraic estimation theory, an entirely new method for controlling, without an explicit model, an underwater vehicle with a configuration of 6 vectored thrusters.
Internship: Predictive multimodal model of dyadic conversations
Apply before: 01/02/2024
How to apply:
This internship is no longer available.
Summary:
The goal of this MSc-level internship is to develop a model of multimodal conversations with two participants. The model will predict discretized speech and video representations of participants' behavior from the history. The idea is to go beyond the turn-taking limitations of current approaches to modeling conversations, and to exploit the results for (1) synthesizing robotic actions and (2) better understanding cognitive science aspects of conversational behavior. Self-supervised transformers will be trained on a large set of audio-video recordings of dyadic conversations.
Internship: Deep transfer knowledge from speech to primate vocalizations
Apply before: 01/02/2024
How to apply:
This internship is no longer available.
Summary:
The goal of this internship is to study the ability of large pre-trained models to transfer knowledge from speech to animal vocal communication. In recent years, computational bioacoustics (i.e., the study of animal sounds with machine learning) has shown increasing interest in pre-trained self-supervised models inspired by sound and speech processing. In this context, the internship aims to test the ability of speech-based models to extract meaningful representations of primate vocalisations in few-shot learning scenarios. More specifically, the candidate will implement efficient fine-tuning and adversarial reprogramming solutions on several pre-trained models and analyse their respective performance in detecting, identifying and classifying Gibbon vocalisations.
Internship: LLM adaptation to the biomedical domain
Apply before: 01/02/2024
How to apply:
This internship is no longer available.
Summary:
The goal of this internship is to propose low-cost methods for adapting general-purpose LLMs that are able to sacrifice general knowledge while acquiring medical knowledge. To this end, the method will have to locate general knowledge in an LLM and overwrite it with domain-specific information.
Internship: Deep learning for fine-grained characterization of galaxy structure
Apply before: 01/04/2023
How to apply:
Send a CV and cover letter to Adeline Paiement: adeline.paiement@lis-lab.fr
Summary:
3- to 6-month internship offered in the DYNI team. Multidisciplinary project in partnership with the Strasbourg astronomical observatory. The exact start date will depend on when the internship agreement is established (2 months are needed from the date of application).
Internship: Tensor learning for color and polarimetric imaging
Apply before: 01/03/2023
How to apply:
Contact Julien Flamant (julien.flamant@cnrs.fr) and Yassine Zniyed (zniyed@univ-tln.fr).
Summary:
The candidate should be enrolled in an M1/M2R or engineering degree in one or more of the following fields: signal and image processing, machine learning, applied mathematics. The candidate should have good written and oral communication skills.
Internship: Text Summarisation with Quantum Natural Language Processing
Apply before: 01/02/2023
How to apply:
This internship is no longer available.
Summary:
Quantum natural language processing (QNLP) is the use of quantum computing to solve NLP tasks faster than any classical computer. The aim of the internship is to apply QNLP to the problem of automatic text summarisation. The student will design quantum algorithms, investigate their asymptotic speedup compared to classical ones and implement proof-of-concept experiments to evaluate them.
See https://alexis.toumi.xyz/jobs/22-11-16-qnlp-summarisation
Internship: Automatic generation of social behaviors for virtual characters using generative adversarial networks (GAN)
Apply before: 01/02/2023
How to apply:
Email Magalie Ochs and Stéphane Ayache (first.last@lis-lab.fr)
Summary:
The goal of this master's internship is to set up a GAN architecture for the automatic generation of multimodal socio-emotional behaviors for virtual characters.
The internship is part of several ongoing projects that aim to simulate social attitudes in virtual actors (e.g., persuasive, aggressive, conciliatory): the TRUENESS project, which aims to develop a virtual reality platform populated with virtual characters to train people to fight social (gender and ethnic) discrimination, and the ANR COPAINS project, which aims to develop a persuasive virtual character to encourage older adults to exercise.
Internship: Psycholinguistic models for word segmentation
Apply before: 01/02/2023
How to apply:
Email Alexis Nasr and Arnaud Rey (first.last@univ-amu.fr)
Summary:
Word segmentation is the task of dividing an acoustic signal into segments that correspond to words. It is a complex task that presumably relies on a large number of cues: acoustic, lexical, syntactic, semantic, and so on.
[Saffran et al., 1996] showed through simple psycholinguistic experiments that humans are able to segment words from a much poorer signal, using only statistical regularities.
[Perruchet and Vinter, 1998] proposed a simple word segmentation algorithm, called "parser", that reproduces part of the behaviors observed by [Saffran et al., 1996] in humans.
The aim of the internship is to start from the model proposed by [Perruchet and Vinter, 1998] and extend it.
Internship: Multimodal Vision-Language Pretraining
Apply before: 01/02/2023
How to apply:
This internship is no longer available.
Summary:
The goal of this internship is to study the multimodal pretraining of transformer-based Vision-Language models. Most models are pretrained using an image-text matching task, while some also have an additional multimodal task such as word-region alignment, like UNITER.
However, multiple studies have shown the weaknesses of state-of-the-art language models. For example, some multimodal concepts that are less represented in the data are harder to extract. In addition, fine-grained multimodal dependencies are hard to capture, especially for vision-language models trained on large noisy datasets with basic multimodal pretraining. More specifically, the goal is to study self-supervised multimodal pretraining tasks and their impact on the ability of a model to extract multimodal information.
Internship: Morpho-syntactic processing of Latin
Apply before: 01/02/2023
How to apply:
Email Alexis Nasr (first.last@lis-lab.fr)
Summary:
This internship is part of a collaboration with the CIELAM laboratory. It aims to provide NLP tools for Latin, specifically a part-of-speech tagger, a morphological analyzer, a lemmatizer, and a syntactic parser. Particular attention will be paid to a characteristic feature of Latin: its rich declension system and its flexible word order within the sentence.
Internship: Leveraging recent NLP techniques for the study of child language acquisition
Apply before: 01/02/2023
How to apply:
Candidates should contact Mitja Nikolaus and Abdellah Fourtassi for more information about the internship (first.last@univ-amu.fr)
Summary:
In our group, we use state-of-the-art machine learning on multimodal child-caregiver conversations to study various aspects of language acquisition. This project focuses on one specific phenomenon of communication: Backchannel (BC) responses. In conversations, BCs are short non-intrusive responses (“uh-huh”, “yeah”) that signal attention and/or understanding from the listener's side. In that way, they form a crucial component for achieving mutual understanding between the interlocutors.
Recently, we proposed that BCs have an important role as feedback signals for children learning language (Nikolaus & Fourtassi 2023). In order to test this hypothesis on a larger scale, this project aims to build models for the automatic detection of BCs in child-caregiver conversations.
Internship: A multimodal approach to study convergence phenomena in natural conversations
Apply before: 01/02/2023
How to apply:
Send an email to Leonor Becerra (leonor.becerra@lis-lab.fr), Philippe Blache (philippe.blache@univ-amu.fr) and Eliot Maës (eliot.maes@lis-lab.fr) with a CV, a cover letter and an academic transcript.
Summary:
We propose in this project to analyze an existing dataset named Badalona-EPSN containing audio, video and cerebral signals recorded in a natural situation. This corpus is enriched with automatic annotations in the linguistic and gestural domains. The goal of the project is to start exploring alignment/entrainment at different modalities (lexicon, syntax, discourse) and study their possible correlation with the neurophysiological signal.
Internship: Understanding user profiles by analyzing massive unstructured conversations in a drug users' web community
Apply before: 01/01/2023
How to apply:
This internship is no longer available.
Summary:
Despite storing information from almost a million users, psychoatif.org lacks structured metadata describing users' demographics, habits, goals and problems. The objective of this internship is to infer user profiles from the textual information published and read by the users in their posts, as well as from their interactions in the virtual community. This internship takes place in the context of a larger project, AI4DU, which will exploit the generated profiles in order to better understand the community and eventually help it better face drug-related problems.
Internship: Methodology for the modeling, monitoring and control of a fleet of mobile vehicles
Apply before: 01/03/2022
How to apply:
This internship is intended for candidates with specific training in automatic control and knowledge of the Python and Matlab software tools.
Candidates may contact Prof. N. K. M’Sirdi (nacer.msirdi@lis-lab.fr): CV and cover letter, followed by an interview
Summary:
The topic concerns the control of a fleet of mobile vehicles that are intended to be autonomous and endowed with operational capabilities and intelligence. This problem raises several scientific challenges in terms of observability, prediction, maneuverability, and controllability.
Internship: Joint speech segmentation and syntactic analysis
Apply before: 01/02/2022
How to apply:
Send a CV and cover letter to benoit.favre@lis-lab.fr & alexis.nasr@lis-lab.fr before November 1st, 2021.
Summary:
Segmenting speech transcripts is difficult due to the lack of punctuation in automatically generated transcripts. Syntactic analysis of the spoken message could help assess the validity of a proposed sentence sequence, but syntactic parsing is often performed after segmentation. The goal of this internship is to develop a joint model of syntactic parsing and sentence segmentation for spoken recordings, based on lexical and prosodic features. A shift-reduce parser will be modified to perform the joint task and account for those specific inputs. Experiments will be carried out on a large corpus of segmented speech annotated with syntax.
Internship: Syntactic analysis of speech without transcription
Apply before: 01/02/2022
How to apply:
This internship is no longer available.
Summary:
Syntactic analysis, or syntactic parsing, consists in predicting a tree representation of the syntactic relationships between the words of a sentence. When processing speech, syntactic parsing requires a word sequence typically generated with automatic speech transcription. The goal of this internship is to reconsider this fundamental assumption and generate a syntactic representation of spoken utterances without having access to a word transcript. Instead, the proposal is to explore unsupervised clustering of acoustic units as input to a syntactic parser. The internship will involve extracting symbolic representations from the raw speech signal and pre-training a shift-reduce syntax parser on large quantities of speech recordings.
Internship: Using deep learning to study children’s multimodal behavior in face-to-face conversation
Apply before: 01/02/2022
How to apply:
Email Abdellah Fourtassi (abdellah.fourtassi@univ-amu.fr) before November 1st, 2021 if possible
Summary:
The study of how children develop their conversational skills and how these skills help them learn from others is an important scientific frontier at the crossroad of social, cognitive, and linguistic development with important applications in health (e.g., mitigating communicative difficulties), education (e.g. improving teaching practices), and child-oriented AI (e.g., virtual learning companions). Recent advances in Natural Language Processing and Computer Vision allow going beyond the limitations of traditional research methods in the lab and advance formal theories of conversational development in real-life contexts. In this internship, we will leverage some of these recent techniques (e.g., multiscale recurrent neural network) to build a model that mimics how children behave in face-to-face conversations with their caregivers and how this behavior develops across middle childhood.
Internship: Using interpretability methods to explain Vision-Language models for medical applications
Apply before: 01/02/2022
How to apply:
This internship is no longer available.
Summary:
Recent developments in Vision-Language multimodal transformers have enabled a variety of novel applications that mix images and text. However, such models offer little explainability, which is a problem in the medical domain. The goal of this internship is to develop multimodal black-box explainability methods that can give users of Vision-Language models rich insight into how such models make decisions for a given instance.
Internship: Impact of language evolution in historical texts on NLP models
Apply before: 01/02/2022
How to apply:
This internship is no longer available.
Summary:
Research in digital humanities often requires processing large sets of historical documents, which are characterized by a high degree of language variation. This variation is mainly due to how language has evolved over the last centuries as a result of political will and standardization efforts. In this context, natural language processing systems, often trained on current language, tend to be affected by language variation and perform poorly. The goal of this internship is to study the effect of language variation on NLP performance and to propose approaches that limit that effect.
Internship: Deep learning for speech perception
Apply before: 01/02/2022
How to apply:
Send a CV and cover letter to ricard.marxer@lis-lab.fr before Nov 1st, 2021
Summary:
The goal of this internship is to produce the first DL-based models that predict human intelligibility at the sublexical level. This translates into predicting the positions in the audio stimuli where confusions will occur, and the type of confusions, in other words, which phonemes are confused with which others on an individual basis.
Internship: Text simplification via the identification of passages referring to implicit information and the estimation of stylistic similarity
Apply before: 01/02/2022
How to apply:
Send a CV and cover letter to patrice.bellot@univ-amu.fr and liana.ermakova@univ-brest.fr before November 30, 2021
Summary:
This internship will study two aspects of text simplification: detecting passages that are not explicit enough to be easily understood, and identifying the stylistic features that matter for the perceived "tone" of the text. It is part of the international collaborative project SimpleText (https://simpletext-madics.github.io/). The proposed approaches will draw on natural language processing and information retrieval (statistical and neural approaches combined with linguistic resources).
The internship will be co-supervised by P. Bellot (Marseille) and L. Nurbakova (Brest).
Internship: Retrieving video content from thematic and emotional queries
Apply before: 01/02/2022
How to apply:
Send a CV and cover letter to patrice.bellot@univ-amu.fr, elisabeth.murisasco@lis-lab.fr and emmanuel.bruno@lis-lab.fr
Summary:
This internship concerns the fields of affective computing, information retrieval, and natural language processing. The target approaches are statistical information retrieval models, deep machine learning, information and data fusion, and human-machine communication.
The focus is on information retrieval described as emotional, in the sense that the user query expresses the need to find documents that evoke a theme with a specific emotional coloring (fear, joy, disgust, surprise...). The documents are videos for which transcripts of the spoken words are available, along with software for analyzing facial expressions.
Internship: AI for decoding emotions in the brain
Apply before: 01/02/2022
How to apply:
Send a CV and cover letter to leonor.becerra@lis-lab.fr, philippe.blache@univ-amu.fr and eliot.maes@lis-lab.fr before 05/12/2021
Summary:
The internship consists in proposing a multimodal model of emotions learned from an existing conversation corpus, K-EmoCon, and validating its relevance by correlating it with the neurophysiological signal.
Internship: Detection and quantification of repetitive movements in children with autism using a deep neural network
Apply before: 01/02/2022
How to apply:
This internship is no longer available.
Summary:
The goal of this internship is to develop a non-intrusive technique for analyzing repetitive movements in order to detect symptoms related to the autism spectrum. The work involves adapting the RepNet method proposed by Dwibedi et al. for detecting repetitions in videos to inputs in the form of a 3D skeleton derived from a video and depth sensor. Methods for computing similarity between multi-parameter curves, as well as data augmentation strategies, will need to be developed.
Internship: Matching contextual and definitional embeddings for a sense-aware reading assistant
Apply before: 30/11/2021
How to apply:
Email Carlos Ramisch and Alexis Nasr (first.last@lis-lab.fr) before November 30th, 2021
Summary:
Imagine you are reading a book in a foreign language that you understand quite well, but you are not totally fluent in. At some point, you come across a word that you do not understand in a sentence. Imagine you can click on the word on your screen and its definition shows up. The goal of this internship is to **develop and evaluate an original NLP model capable of aligning a word's context with its correct definition, even if the word is ambiguous, i.e., has more than one definition listed in the dictionary.**
This internship will take place in the context of the recently funded ANR SELEXINI project. The project aims at developing lexicon induction methods to create a large structured semantic lexicon for French. One of the by-products of this internship is a large French corpus with corresponding contextual embeddings aligned to Wiktionary entries. The intern will join the TALEP team in Luminy, Marseille, and have the opportunity to interact with researchers in the partner universities (Univ. de Saclay, Univ. de Paris, Univ. de Lorraine) and submit a paper to an international conference, depending on the results of the internship.
Internship: Representation Learning for Text Mining Tasks
Apply before: 30/11/2021
How to apply:
Master 2 in Computer Science
NLP, Deep Learning, Relational Learning, Hybrid Approaches, Relation Extraction
Contact: Bernard.espinasse@lis-lab.fr
Summary:
Text mining increasingly uses Deep Learning techniques for Natural Language Processing (NLP) tasks such as information extraction (named entity recognition and relation extraction) or higher-level tasks such as text simplification, and automatic text summarization.
Such deep learning techniques are based on many neural network architectures, including Convolutional (CNN), Recurrent (RNN), and Long Short-Term Memory (LSTM) neural networks, and more recently Transformers with BERT (Bidirectional Encoder Representations from Transformers), which achieve impressive results in many NLP tasks.
However, as demonstrated by recent studies, such performance can be improved mainly by integrating linguistic features such as syntactic dependencies (Espinasse et al., 2019). In addition, other symbolic NLP-based techniques make better use of linguistics and external semantic resources (ontologies), including the use of relational learning as in (Lima et al., 2019) (Verbeke et al., 2014). To go beyond the limits of deep learning techniques, combining them with these symbolic techniques seems to be beneficial.
This research work will address recent advances in representation learning (Škrlj et al., 2021), a cutting-edge research area of machine learning. Representation learning refers to modern data transformation techniques that convert data of different modalities and complexity, including texts, graphs, and relations, into compact tabular representations, which effectively capture their semantic properties and relations.
More particularly, this Master's internship will focus on new hybrid software solutions combining two approaches for symbolic and embedding representation (Lavrac et al., 2021) (Škrlj et al., 2021): (i) propositionalization approaches, established in relational learning and inductive logic programming, and (ii) embedding approaches, which have gained popularity with recent advances in deep learning.
Once the benefits and limitations of these new hybrid approaches based on representation learning techniques have been better identified, their implementation will be evaluated on specific tasks such as named entity recognition and/or relation extraction.