Beyond word frequency: Bursts, lulls, and scaling in the temporal distributions of words - Computer Science > Computation and Language





Abstract:

Background: Zipf's discovery that word frequency distributions obey a power law established parallels between biological and physical processes, and language, laying the groundwork for a complex systems perspective on human communication. More recent research has also identified scaling regularities in the dynamics underlying the successive occurrences of events, suggesting the possibility of similar findings for language as well.

Methodology/Principal Findings: By considering frequent words in USENET discussion groups and in disparate databases where the language has different levels of formality, here we show that the distributions of distances between successive occurrences of the same word display bursty deviations from a Poisson process and are well characterized by a stretched exponential (Weibull) scaling. The extent of this deviation depends strongly on semantic type (a measure of the logicality of each word) and less strongly on frequency. We develop a generative model of this behavior that fully determines the dynamics of word usage.

Conclusions/Significance: Recurrence patterns of words are well described by a stretched exponential distribution of recurrence times, an empirical scaling that cannot be anticipated from Zipf's law. Because the use of words provides a uniquely precise and powerful lens on human thought and activity, our findings also have implications for other overt manifestations of collective human dynamics.
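To make the central quantity concrete, the following is a minimal sketch, not the authors' code, of how the distances between successive occurrences of a word might be measured, together with the standard Weibull survival function. The function names, the toy sentence, and the parameter names `scale` and `beta` are assumptions made for illustration; the abstract does not give the authors' parameterization or fitting procedure. With `beta = 1` the expression reduces to a plain exponential, the continuous-time analogue of a Poisson process, while `beta < 1` gives the heavier tail associated with bursts and lulls.

```python
import math


def recurrence_times(tokens, word):
    """Distances, counted in tokens, between successive occurrences of `word`."""
    positions = [i for i, t in enumerate(tokens) if t == word]
    return [b - a for a, b in zip(positions, positions[1:])]


def weibull_survival(tau, scale, beta):
    """Probability that a recurrence time exceeds `tau` under a Weibull law.

    `scale` and `beta` are illustrative parameter names (assumed here).
    beta == 1 gives an ordinary exponential; beta < 1 stretches the tail.
    """
    return math.exp(-((tau / scale) ** beta))


# Toy text, invented for this example only.
tokens = "the cat saw the dog and the dog saw the cat".split()
gaps = recurrence_times(tokens, "the")

# Compare the tail probability of a long gap under the two shapes.
exponential_tail = weibull_survival(10.0, 2.0, 1.0)
stretched_tail = weibull_survival(10.0, 2.0, 0.5)
```

On a real corpus, the empirical fraction of gaps longer than a given value could be compared against `weibull_survival` for candidate values of `scale` and `beta`; how the paper itself estimates these is not stated in the abstract.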



Authors: Eduardo G. Altmann, Janet B. Pierrehumbert, Adilson E. Motter

Source: https://arxiv.org/






