MappingBooks Jump in the book!
Dan Cristea 1, 2, Ionuț Cristian Pistol 1
1 “Alexandru Ioan Cuza” University of Iași
Faculty of Computer Science
2 Romanian Academy, the Iași branch
Institute for Computer Science
{dcristea, ipistol}@info.uaic.ro
The idea:
I like to read books and to travel…
I need help to remember all kinship relations between characters
Exploitation of textual information
MappingBooks – a project proposal
• A MappedBook is a book connected with locations/events in the virtual and real world and sensitive to the instantaneous location of the reader (as sensed by the mobile phone or tablet).
• The information made available may differ depending on the moment and the place of the reader.
Aims
1) connect entities’ mentions in the form of nominals (noun phrases) => one coreferential chain corresponds to each entity;
2) no preliminary records about linked entities => the knowledge base evolves from scratch;
3) look especially for coreferential (identity of entity mentions) and geographical relations (position, distance, point-of, near, intersects, etc.);
4) texts under investigation: Geography manuals and travel guides
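Aim 1 (one coreferential chain per entity) can be illustrated with a small sketch. It assumes that some earlier step has already produced pairs of mentions judged coreferent, and it merges them with a union-find structure. The function name `build_chains` and the sample mentions are hypothetical, not part of the project.

```python
from collections import defaultdict


def build_chains(mentions, coref_pairs):
    """Group mentions into coreferential chains (hypothetical helper).

    mentions: list of mention identifiers, in text order.
    coref_pairs: pairs (a, b) already judged to refer to the same entity.
    """
    parent = {m: m for m in mentions}

    def find(m):
        # Follow parent links to the chain representative, shortening the path.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in coref_pairs:
        parent[find(a)] = find(b)  # merge the two chains

    chains = defaultdict(list)
    for m in mentions:
        chains[find(m)].append(m)
    return sorted(chains.values())


# Illustrative data only: two entities, two mentions each.
mentions = ["Iași", "the city", "Prut", "the river"]
pairs = [("Iași", "the city"), ("Prut", "the river")]
print(build_chains(mentions, pairs))
```

Because no entity records exist beforehand (aim 2), each chain here starts as a single mention and grows only as links are found.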
Entity linking
• Challenges in entity linking:
– name variations
– ambiguities
– absence
  • entity
  • link type
Entities
• Type PERSON
• Type LOCATION
• Type ORGANISATION
Textual realisation of entities
• Syntactic realisation: NPs (proper nouns, common nouns, adjectives, complement PPs; but NO relative clauses)
• Characterised by distinctive heads
– [casa de pe [munte]] (“the house on the mountain”)
• If intersected => imbricated (nested)
– [Muzeul [Grigore Antipa]] (“the Grigore Antipa Museum”)
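The bracket notation in the two examples can be read mechanically. The sketch below is one possible reading, not the project's annotation format: it assumes mentions are marked only with square brackets and returns each mention with its nesting depth. The function name `nested_mentions` is hypothetical.

```python
def nested_mentions(annotated):
    """Return (depth, text) for each bracketed mention, in opening order."""
    stack = []   # open brackets: (start offset in plain text, slot in spans)
    spans = []   # results, one slot reserved per opening bracket
    plain = []   # characters of the text with brackets removed
    for ch in annotated:
        if ch == "[":
            stack.append((len(plain), len(spans)))
            spans.append(None)
        elif ch == "]":
            if not stack:
                raise ValueError("unbalanced ']'")
            start, slot = stack.pop()
            # Depth is the number of mentions still open around this one.
            spans[slot] = (len(stack), "".join(plain[start:]))
        else:
            plain.append(ch)
    if stack:
        raise ValueError("unbalanced '['")
    return spans


print(nested_mentions("[casa de pe [munte]]"))
print(nested_mentions("[Muzeul [Grigore Antipa]]"))
```

With brackets, two mentions can only be disjoint or nested, which matches the rule that intersecting mentions must be imbricated.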
TA = Text Analytics
NER = Named Entity Recognition
EC = Entity Crawling
RD = Relations Detection
GEO = Geography
M&T = Maps and Trajectories
AR = Augmented Reality
DEV = Device Info
INT = Interfaces
RES = Resources
M&E = Management and Evaluation
Features we want to have
• The capacity to see a text as something other than a string of letters
– sentence splitting
– tokenisation
– POS-tagging
– lemmatisation
– NP chunking
– anaphora resolution
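The first two steps of this list can be sketched with regular expressions. This is a deliberately naive illustration: the rules and the function names `split_sentences` and `tokenise` are assumptions, and a real pipeline would use language-specific tools for Romanian text (abbreviations, clitics, and so on).

```python
import re


def split_sentences(text):
    # Naive rule: a sentence ends at '.', '!' or '?' followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenise(sentence):
    # A token is a run of word characters (diacritics included)
    # or a single punctuation mark.
    return re.findall(r"\w+|[^\w\s]", sentence)


text = "I like to read books. I like to travel!"
for sentence in split_sentences(text):
    print(tokenise(sentence))
```

The later steps (POS-tagging, lemmatisation, NP chunking, anaphora resolution) build on these tokens and are not shown.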
Features we want to have
• Know who’s who
– recognise names and types
– disambiguate names
– recognise an entity in the text even if mentioned with a common noun or a pronoun
– use an ontology of types
NAMED ENTITY RECOGNITION
Features we want to have
• Which virtual world entities are mentioned in the book
– link textual mentions to entities in the virtual world
– decide what info from the virtual world would be relevant to the user
– use multiple sources
ENTITY CRAWLING
Features we want to have
• Trace on Google Maps a spatial relation described in the book
– spatial relations detection in text
– use Google Maps APIs or related free technologies
– trace locations and paths on maps
RELATIONS DETECTION, MAPS & TRAJECTORIES
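Tracing a path on a map needs the detected locations in a form a mapping tool can draw. The sketch below uses GeoJSON, chosen here only as a neutral example of a free technology; the slides do not say which format the project uses. The function name `path_to_geojson` is hypothetical and the coordinates are approximate values supplied for illustration.

```python
import json


def path_to_geojson(places):
    """Build a GeoJSON LineString feature from places in mention order.

    places: list of (name, latitude, longitude) tuples.
    """
    if len(places) < 2:
        raise ValueError("a path needs at least two places")
    return {
        "type": "Feature",
        "geometry": {
            "type": "LineString",
            # GeoJSON positions are ordered [longitude, latitude].
            "coordinates": [[lon, lat] for _, lat, lon in places],
        },
        "properties": {"stops": [name for name, _, _ in places]},
    }


# Approximate coordinates, for illustration only.
route = [("Iași", 47.16, 27.59), ("Suceava", 47.65, 26.26)]
print(json.dumps(path_to_geojson(route), ensure_ascii=False, indent=2))
```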
Features we want to have
• Fetch, process and make use of geo-data
– Geographic Information Systems (GIS)
– geographic layers
GEOGRAPHY
Features we want to have
• Know where I am
• What real world entities are in my proximity
– detection of my position
– computation of distances from the mentioned places
– signalling “interesting” locations in proximity
DEVICE INFO
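The distance computation and the proximity signal can be sketched with the haversine great-circle formula. This is an assumption about method: the slides do not specify how distances are computed. The names `haversine_km` and `nearby`, the radius, and the approximate coordinates are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby(reader, places, radius_km):
    """Names of places within radius_km of the reader, closest first.

    reader: (latitude, longitude) reported by the device.
    places: dict mapping place name to (latitude, longitude).
    """
    hits = []
    for name, (lat, lon) in places.items():
        d = haversine_km(reader[0], reader[1], lat, lon)
        if d <= radius_km:
            hits.append((d, name))
    return [name for _, name in sorted(hits)]


# Approximate coordinates, for illustration only.
places = {"Iași": (47.16, 27.59), "Suceava": (47.65, 26.26)}
print(nearby((47.16, 27.59), places, radius_km=10))
```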
Features we want to have
• Mix images with generated info
– process images => segmentation, contours, recognition
– sense orientation of the camera
– decide info to be displayed
AUGMENTED REALITY
Features we want to have
• Attractive user interfaces
– analyse use cases
– design dedicated user interfaces
– accommodate on the screen a segment of text, a map, user’s position, web info, etc.
INTERFACES
Features we want to have
• Client-server
– user’s portrait
– the databases
– standards and communication protocols
CLIENT-SERVER
Other issues…
• RESOURCES
– find the texts
– clear IPR
– perform annotation
– find other relevant linguistic data
• MANAGEMENT AND EVALUATION
– establish evaluation criteria and metrics
– monitor the evolution of the project
– report dangers
– perform final evaluation
MappingBooks – Conclusions
• Multi-dimensional mash-ups combining textual, geographical and temporal data
• Spot the book mentions (persons and locations)
• Make heavy use of entity linking techniques => connecting entity mentions to the virtual world
• Links sensitive to:
– the context of mentions in the book
– the current location of the user
– the moment the user initiates an access
Acknowledgements
• MappingBooks is a project supported by a grant of CNCSIS, Romanian Ministry of Education and Research, July 2014 – June 2016, in a consortium with SIVECO Bucharest and “Ștefan cel Mare” University of Suceava
• Thanks to our students in Computer Science, for developing a prototype of the system during their project in AI, in the Autumn – Winter term of 2013-2014…