Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University, Shanghai 201203
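WANG Xiaoyang (X. Sean WANG) (1960-), male, born in Shanghai, is a professor at Fudan University. His main research interests include spatio-temporal mobile data analysis, data system security and privacy, and parallel big data analysis and computing.
ZHENG Xiaoqing (Xiao-qing ZHENG) (1979-), male, born in Hangzhou, Zhejiang, is an associate professor at Fudan University. His main research interests include data integration, natural language understanding, and the Semantic Web.
XIAO Yanghua (Yang-hua XIAO) (1980-), male, born in Hongze, Jiangsu, is an associate professor at Fudan University. His main research interests include databases, data mining, massive data processing, graph databases, and graph data mining.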
Published online: 2015-12
Published in print: 2015-12-25
X. Sean WANG, Xiao-qing ZHENG, Yang-hua XIAO. Entity-relation modeling and discovery for smart search[J]. Journal on Communications, 2015, 36(12): 178-189. DOI: 10.11959/j.issn.1000-436x.2015311.
By connecting mobile networks, the Internet of Things, and sensor networks to the Internet, cyberspace has expanded into a ubiquitous space of human beings, machines, and things. With the arrival of the big data era, traditional search engines can no longer meet current needs and are evolving into their next generation: big search (also called smart search). Entity-relation modeling and discovery is one of the key techniques for realizing the vision of smart search. This paper discusses approaches to modeling entities and their relations at large scale with knowledge graphs and knowledge warehouses, ways to discover new entities and the relations between them in cyberspace, and the related design ideas and implementation techniques.