

people: an engineer/programmer friend has a project coming up for a class, and they asked if there’s anything they could help with for my diss since they have no good project ideas. They could program something to help with my corpus stuff, turn that in, maybe get an article with me out of it.

Any… any good ideas on what could be generally useful for (ES)? I can try and come up with something just for me, but it’d be cool if it’s useful for the field at large too.

The project is apparently specifically on large data management, so corpus seems like a great topic, but… what do we do. Help.

Ártemis López

@grvsmth I've only used one program, so I'm not sure how much conversion is needed to switch from one to another. Wouldn't it be a fairly straightforward find-and-replace regex? Like how translation tools will sometimes have <1>, or {1}, or a couple of things like that. 🤔
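
For the simple case described here, a find-and-replace really can be a single regex. A minimal sketch in Python, assuming one tool writes numbered placeholders as <1> and another as {1}; the function name `angle_to_curly` is made up for illustration and does not come from any particular tool:

```python
import re

# Assumption: placeholders are a number in angle brackets, e.g. <1>, <2>.
ANGLE_TAG = re.compile(r"<(\d+)>")

def angle_to_curly(text: str) -> str:
    """Rewrite every <n> placeholder as {n}, leaving other text untouched."""
    return ANGLE_TAG.sub(r"{\1}", text)

print(angle_to_curly("Hello <1>world<2>!"))
```

This only works because both styles mark the same thing in the same place; only the brackets differ.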

@queerterpreter Even if it were, something that collects all the regexes for each system would be helpful!

But it's actually more complex. There are inline and offset annotation systems, for one thing!
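
To illustrate why inline versus offset is more than a bracket swap, here is a rough sketch of converting one into the other. It assumes a made-up inline format, `<TAG>text</TAG>` with no nesting, and produces plain text plus a list of (start, end, label) character offsets. Real annotation systems differ in the details, so treat the format and the name `inline_to_offset` as hypothetical:

```python
import re

# Assumption: inline annotations look like <TAG>covered text</TAG>, never nested.
INLINE = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)

def inline_to_offset(annotated: str):
    """Return (plain_text, spans), where each span is (start, end, label)
    and the offsets index into plain_text, not into the annotated input."""
    parts = []    # pieces of the plain text, in order
    spans = []    # offset annotations collected so far
    pos = 0       # how far we have read in the annotated input
    out_len = 0   # length of the plain text built so far
    for m in INLINE.finditer(annotated):
        before = annotated[pos:m.start()]
        parts.append(before)
        out_len += len(before)
        covered = m.group(2)
        spans.append((out_len, out_len + len(covered), m.group(1)))
        parts.append(covered)
        out_len += len(covered)
        pos = m.end()
    parts.append(annotated[pos:])  # trailing text after the last tag
    return "".join(parts), spans

text, spans = inline_to_offset("I saw <PER>Ana</PER> in <LOC>Madrid</LOC>.")
print(text)
print(spans)
```

Unlike a plain find-and-replace, the converter has to track how removing each tag shifts every later position, and nested or overlapping annotations would need more than this.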

@grvsmth Hmmm, this could be interesting to her! Do you know if there's a good starting point I could direct her (it's HER final project, after all) to start looking into the different annotations out there?

@queerterpreter ahahahaha! Yes, I know people have compiled a list of annotation systems. Several lists, in fact! I've probably bookmarked some of them on my other computer, and there are some in my email inbox. Maybe there's a list of lists of annotation systems. This Wikipedia article is probably a good place to start!

en.wikipedia.org/wiki/Text_ann
