Tuesday, 25 September 2012

Job: professor of lexical semantics, Montréal

A position as professor of linguistics, with a specialization in lexical semantics and a "demonstrated interest in applications of lexical semantics to NLP":

The Département de linguistique et de traduction of the Université de Montréal invites applications for a full-time position as professor of linguistics, with a specialization in lexical semantics.

Here is the exact link:
http://www.fas.umontreal.ca/affaires-professorales/postes/affichage_2012_2013/LNG-1_FRA_long_final.pdf

Monday, 24 September 2012

Talk @ #ParisJS: the Fullproof library

A talk at ParisJS (a JavaScript user group that meets every month to present the latest Web/JS technologies), given by an alumnus of the LI program, about Fullproof, an open-source full-text search library.

Fullproof is an open-source library providing a complete stack for full-text search in JavaScript. It uses local storage when available and includes Unicode handling as well as lemmatization and phonetic simplification algorithms, which makes very good text search possible in offline applications. I will present the issues at stake, the advantages, and the current limitations of this solution.
Presented by Rodrigo Reyès
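
To make the ingredients named in the abstract concrete, here is a minimal sketch of a full-text index with Unicode normalization and a crude form of stemming. It is written in Python for brevity and is not Fullproof's API: the names `normalize`, `naive_stem`, `tokenize` and `TinyIndex` are hypothetical, the suffix stripping is only a stand-in for real lemmatization, and phonetic simplification and local storage are left out.

```python
import unicodedata
from collections import defaultdict


def normalize(token: str) -> str:
    """Lowercase a token and strip diacritics (e.g. 'École' -> 'ecole')."""
    decomposed = unicodedata.normalize("NFKD", token.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def naive_stem(token: str) -> str:
    """Crude suffix stripping; an assumed stand-in for a real lemmatizer."""
    for suffix in ("es", "s", "e"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def tokenize(text: str) -> list[str]:
    """Split on non-alphanumeric characters, then normalize and stem."""
    words = "".join(ch if ch.isalnum() else " " for ch in text).split()
    return [naive_stem(normalize(w)) for w in words]


class TinyIndex:
    """In-memory inverted index: term -> set of document ids."""

    def __init__(self) -> None:
        self.postings: dict[str, set[int]] = defaultdict(set)

    def add(self, doc_id: int, text: str) -> None:
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query: str) -> set[int]:
        """Return ids of documents containing every query term."""
        terms = tokenize(query)
        if not terms:
            return set()
        return set.intersection(*(self.postings.get(t, set()) for t in terms))
```

Because queries go through the same `tokenize` step as documents, a search for "École Paris" matches a document containing "école" and "Paris" regardless of case, accents, or a plural "s".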


This may interest students and others from the program, provided they have at least a little JavaScript background (otherwise it may feel a bit abstract, but why not).
The talk is free, but you must register on the user group's site to get a seat: http://parisjs.org/

Friday, 21 September 2012

Jobs: temporary job positions at Google Speech

A posting from Google Ireland that has received very few responses so far (particularly for Spanish and French).


Job title:

Speech Linguistic Project Manager (French, German, Italian, Iberian Spanish)

Job description:

As a Linguistic Project Manager and a native speaker of one of the target languages, you will oversee and manage all work related to achieving high data quality for speech projects in your own language.

You will be based in the Dublin office, managing a team of Data Evaluators and working on a number of projects towards Speech research: ASR, TTS, and NLP.

This includes:

- managing and overseeing the work of your team
- creating verbalisation rules, such as expanding URLs, email addresses,
  numbers
- providing expertise on pronunciation and phonotactics
- building and maintaining a database of speech recognition patterns
- creating pronunciations for new lexicon entries, maintaining the lexicon
- working with QA tools according to given guidelines and using in-house
  tools

Job requirements:

- native-level speaker of one of the target languages (with good command
  of the standard dialect) and fluent in English
- keen ear for phonetic nuances and attention to detail; knowledge of
  the language's phonology
- must have attended elementary school in the country where the language
  is spoken
- ability to quickly grasp technical concepts
- excellent oral and written communication skills
- good organizational skills, previous experience in managing external resources
- previous experience with speech/NLP-related projects a plus
- advanced degree in Linguistics, Computational Linguistics preferred
- also a plus: proficiency with HTML, XML, and some programming
  language; previous experience working in a Linux environment

Project duration: 6-9 months (with potential for extension)

For immediate consideration, please email your CV and cover letter in English (PDF format preferred) with "Speech Linguistic Project Manager [language]" in the subject line.

Email Address for Applications: DataOpsMan@gmail.com
Contact Person: Linne Ha
Closing date: open until filled

Wednesday, 19 September 2012

Job: postdoc positions, ENS Paris

Very interesting postdoc positions, well suited to computational linguists, in a great research lab.
Deadline: 20 October 2012.


POSTDOCTORAL POSITIONS AT THE ECOLE NORMALE SUPERIEURE, PARIS

The Institute for Cognitive Studies at the Ecole Normale Supérieure in Paris is seeking to fill THREE POSTDOCTORAL POSITIONS within a *multidisciplinary project* at the intersection of *speech engineering, computational neuroscience, developmental psychology and computational linguistics*.
The aim of this project is to decipher how babies spontaneously learn their first language by applying a 'reverse engineering' approach, i.e., by constructing an artificial language learner that mimics the learning stages of the infant. It uses engineering and applied math techniques (*zero resource speech recognition, signal processing, machine learning*) on large corpora of child-adult verbal interactions in several languages. It develops psychologically plausible (*unsupervised*) and biologically plausible (*bio-inspired*) algorithms that can discover linguistic categories (*words, syllables, phonemes, features*). The predictions of these algorithms are then tested against perceptual data gathered from newborns and older infants using behavioral techniques (*eye tracking*) or noninvasive brain imagery (*Near InfraRed Spectroscopy, EEGs*).
The project is hosted by the *Institute for Cognitive Studies*, which offers an international and vibrant research setting, at the heart of the *Quartier latin* in Paris. The institute has been ranked as one of the top research centers in France across all disciplines, and is one of the leading interdisciplinary centers in Cognitive Science in Europe. It contains 70 permanent researchers structured in interacting teams covering a broad range of topics, ranging from Philosophy of Consciousness to animal electrophysiology, with a strong core in cognitive psychology and language.
We are looking for young researchers able to build and maintain a high quality research program and to contribute to a growing international collaborative community in this area of quantitative cognitive science. Applicants will ideally combine:

  • a solid background in one or more of the following areas: speech/language engineering, signal processing, statistical modeling, Bayesian methods, neuroimaging data analyses, computational neuroscience, computational linguistics,
  • considerable familiarity with general cognitive science and/or language science,
  • a documented interest in interdisciplinary and team-based research,
  • research creativity, independence, and productivity.
The positions are for two years, with salaries set at a competitive European level (between 2400 and 2700 euros per month depending on prior experience). We will also provide generous travel funds. There is no associated university teaching load, although researchers will be able to participate in the research culture of the Institute through seminars, supervision of students and other activities. Starting dates are flexible. Women are encouraged to apply. Candidates should send a letter of motivation (2 pages max.), the contact information of 2 to 3 referees, and a CV to emmanuel.dupoux@gmail.com BY OCTOBER 20, 2012. Interviews of short-listed candidates will be conducted in the Fall, either in Paris or by videoconference.
Further information about the project can be found at: http://www.lscp.net/persons/dupoux/bootphon/index.html





Tuesday, 18 September 2012

Alumni testimonials

We have just put online a series of testimonials from former students of the LI program, either sent to us directly or posted in the LinkedIn group Anciens du Cursus de Linguistique Informatique.

We are very proud of all these testimonials, and I swear that I have not edited or removed a single one (testimonials are still coming in, though, and it takes a few days to put them online).
 


Monday, 17 September 2012

Edward Gibson seminars

Experimental linguistics and information theory are not exactly NLP, but they should interest computational linguists, and it is a great opportunity to have this leading specialist here, invited by EFL and LLF for the semester.

The Labex EFL is pleased to announce the course given by Edward Gibson (MIT, Brain and Cognitive Sciences), which begins on Monday, 17 September, at LLF, Paris Diderot, room 4C92.

Classes every Monday from 17 September to 10 December, 16:00-18:00
175 rue du Chevaleret, 4th floor, room 4C92

Abstract

The proposed seminar is inspired by Shannon's (1948) seminal work on information theory and will focus on communicative properties of language representations and processing. Three broad questions will be addressed: 1. How do communicative pressures shape the lexicon? 2. How do communicative pressures shape the syntax of languages? 3. How do communicative properties of language affect interpretation of the linguistic signal? In the first section of the course I will discuss Zipf's (1949) original ideas on the relationship between word length and frequency and recent findings that extend those ideas and show that word length is better explained by how predictable a word is in the context in which it appears than by word frequency. I will then talk about lexical ambiguity and present evidence that ambiguity is not a weakness of the language system, as has been argued by e.g., Chomsky. Instead, ambiguity is a desirable feature of any communication system because it allows for a more efficient lexicon. In particular, the same 'easy' (e.g., short or easily pronounceable) words can be reused in the language, referring to different things, because context can and does robustly disambiguate the intended meaning. I will also discuss ambiguity in the context of word learning.

In the second section of the course I will focus on language structure. A recent paradigm, in which participants are asked to gesture the meanings of simple events (Goldin-Meadow et al., 2008), has been shown to reflect word-order preferences somewhat independent of native language and is thus promising for revealing fundamental properties of our communication system and perhaps shedding light on how different word orders arose in the course of language evolution. For example, when shown a picture or animation of a boy kicking a ball, speakers of languages with either a subject-verb-object (SVO) order or a subject-object-verb (SOV) order gesture the event as subject-object-verb (i.e., 'boy', 'ball', 'kick'). This finding has been interpreted as evidence for SOV order being somehow fundamental or 'base' in how we think about events. Critically, however, we have shown that when asked to gesture meanings of 'reversible' events (e.g., a boy pushing a girl), where either entity can be the agent or the patient of the action, the gesture sequences shift to the SVO order, across languages. This finding fits well with the idea that communication takes place over a noisy channel, where the signal may get corrupted in some way. The use of SVO gesture sequences for reversible events makes the meaning more robust to noise, thus maximizing the chances of the comprehender recovering the intended meaning. I will further connect these experimental findings to typological patterns across languages. The noisy-channel hypothesis makes several attested predictions about cross-linguistic variation, such as prevalence of case-marking in SOV languages, and lack of case-marking in SVO languages, suggesting that a shift from SOV to SVO word order makes the language sufficiently robust to noise. However, if a language keeps its base SOV word order, then another 'device' is needed to make the signal more robust to noise, and case-marking is one way of marking the agent and patient of an event.
In the third section of the course I will talk about how communicative properties of the language system affect the interpretation of the linguistic signal. Until recently, it has been assumed that the input to the sentence comprehension mechanism is an error-free sequence of words. However, given that noise (speaker errors, perception errors, or noise in the environment) can corrupt the linguistic signal, as discussed above, the human sentence comprehension mechanism is plausibly adapted to process sentences that contain errors. A noisy-channel model predicts that the interpretation of a corrupted linguistic signal will depend on how easily recoverable the plausibly intended signal is. If the comprehender can easily explain how the original signal got corrupted, then s/he will rely on semantic cues for interpretation. If, on the other hand, there is no clear explanation for how the signal got corrupted, then the comprehender is predicted to more closely follow the actual syntactic form of the utterance in deriving an interpretation. These predictions are in contrast to those of some earlier models, according to which the final interpretation is always determined by the syntax of an utterance (cf. MacWhinney et al., 1984), even though meaning may guide initial interpretation in the face of temporary syntactic ambiguity. I will discuss several recent sets of findings in sentence comprehension that provide support for the noisy-channel model as well as re-interpret some classical findings in the context of this model.

Wednesday, 5 September 2012

Timetables

[edit 17/09] Classes start this week. All students in the LI program are invited to attend, even if their administrative situation has not yet been settled.

The timetables, still in a very provisional version, can be viewed on this page.
Other information about the start of term is still on this page.

About the Functional Programming course:
  • first lecture on Friday, 21 September (10:30, Amphi 2A)
  • first tutorial (TD) on Thursday, 27 September (14:30, room 557C)
About the algorithms course (M1):
  • The lecture (CM) and tutorial (TD) slots have been swapped:
    • CM (Choffrut): Wednesday 8:30-10:30
    • TD (Fagnot): Wednesday 12:00-14:00
  • Expect a room change for the 8:30 lecture
  • Lectures and tutorials start the same week, the week of 17 September.