Tuesday, 10 March 2015

Postdoctoral Fellowship in Emotion Analysis, Galway

The Unit for Natural Language Processing of the data analytics centre at the National University of Ireland, Galway is offering a postdoctoral fellowship in emotion analysis. The application deadline is 22 March.

Open Position: Postdoctoral Researcher in Emotion Analysis

The Unit for Natural Language Processing [1] of the Insight Centre for Data Analytics [2] at the National University of Ireland, Galway [3], invites applications for a postdoctoral position in emotion analysis.

The position is mainly associated with the EU-funded project MixedEmotions, which will implement an integrated Big Linked Data platform for emotion analysis across heterogeneous data sources, languages and modalities. The MixedEmotions platform will provide an integrated solution for: i) large-scale emotion analysis and fusion on heterogeneous, multilingual, text, speech, video and social media data streams, leveraging open access and proprietary data sources, exploiting also social context by leveraging social network graphs; ii) semantic-level emotion information aggregation and integration through robust extraction of social semantic knowledge graphs for emotion analysis along multidimensional clusters. The platform will be developed and evaluated in the context of three cross-domain Pilot Projects that are representative of a variety of data analytics markets: Social TV, Brand Reputation Management, and Call Centre Operations.

The open position will be concerned primarily with the text mining aspects of the project, in particular emotion analysis from and across text in multiple languages. Candidates should have a PhD degree in a relevant field of study with an emphasis on areas such as text mining, natural language processing, computational linguistics, machine learning etc. The position is for a fixed period of up to 24 months with the possibility of extension, depending on successful acquisition of project funding in which the selected candidate will be expected to actively participate.

The successful applicant will be required to travel to project meetings at various locations in Europe, and will need to collaborate and cooperate with other project partners, from various disciplines.

Salary range: €37k – €42k per annum (commensurate with qualifications and experience)

Please send your application (CV, cover letter, both in PDF only) before the closing date of March 22nd, 2015 to Dr. Paul Buitelaar:


Friday, 6 March 2015

Programme Alumni Day 2015

The 2015 edition of the LI programme alumni day is scheduled for Saturday 11 April, at the Halle aux farines, room 247E (the simplest way to get there is to take staircase C after entering the building at 16 rue Françoise Dolto).

As last year, the programme will include talks by programme alumni, talks by professionals in the field, and an invited talk.
The programme is being put together by Adrien Roux (M1 LI) and Olga Seminck (M2 LI). If you have suggestions or questions, you can contact them (send me a message and I will pass it on).
[edit: the programme is now online]

Thursday, 5 March 2015

Talk: Pierre Magistry, Stratification du réseau lexical du hokkien de Taïwan (Stratification of the Lexical Network of Taiwanese Hokkien)

A talk by an alumnus of the LI programme, who applies fairly powerful NLP methods (semi-automatic construction of lexical graphs) to a problem in historical linguistics. It takes place next Wednesday, 11 March, from 4 pm to 6 pm, hosted by CRLAO at INaLCO, Salle des Plaques, 2, rue de Lille, 75007 Paris.

Pierre Magistry (Paris Diderot):  
Stratification du réseau lexical du hokkien de Taïwan

Taiwanese Hokkien shows strong evidence of multiple historical layers of borrowing through the pervasiveness of 「多音字」duoyinzi, sinograms with multiple readings. For a given sinogram, traditional analysis distinguishes between wenduyin 文讀音 and baiduyin 白讀音 (so-called "literary" and "colloquial" readings), but it is well accepted that more than two strata are to be found and described.
We will both stress the limits of such an analysis and explain how we can still benefit from it.
Fortunately, a large amount of lexical data for Taiwanese Hokkien is available as Open Data. We will propose a method to (semi-) automatically model and explore the Taiwanese lexicon in search of such strata.
Our method is based on modelling the lexicon as a complex network.
We will first introduce all the theoretical background our model needs (so that no prior knowledge of graph theory is required to attend this presentation). Then we will explain how we can rely on graph theory to create a model of the lexical data as a graph that takes into account the traditional analysis of 文讀音 and 白讀音. Once the model is created, we will show how it can be analysed and explored in search of lexical strata using community detection and advanced visualisation tools.

Talk at the DEC colloquium (ENS): Mark Steedman

Mark Steedman, a well-known figure in formal linguistics and NLP, will present work at the DEC colloquium (ENS) next Tuesday, 10 March. The talk is entitled "Statistical Parsing and Interpretation for Jazz Chord Sequences".

Tuesday 10 March, 11:30 am to 1:00 pm, Salle Langevin, 29, rue d'Ulm.

The paper describes work with Mark Granroth-Wilding on the problem of parsing and interpreting chord sequences using a grammar of tonal harmony. The grammar is drawn from the same "nearly context-free" class that is strongly adequate for natural languages, and supports a model theory of harmony. Though the grammar is much smaller than language grammars, the degree of ambiguity is much greater, in the sense that any chord can have an unboundedly large number of increasingly preposterous interpretations. Parsing requires the use of a supervised statistical model to limit search, of the same kind that is required for parsing natural language at scale. We evaluate its performance on held-out data against a baseline Markov model, which it outperforms, showing that the grammar is doing useful cognitive work.