Mailing List Archive
Re: [tlug] Open source license (wikipedia)
- Date: Thu, 17 May 2018 17:49:13 +0100
- From: Darren Cook <darren@example.com>
- Subject: Re: [tlug] Open source license (wikipedia)
- References: <01967dcf-dc9e-0f08-b0d1-7c844db58684@dcook.org> <23293.29039.66965.697994@turnbull.sk.tsukuba.ac.jp>
- User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0
> > My real question, of course, is can I train a machine learning
> > model on that text data, and release it under a more liberal
> > license? Assuming the model is effectively a one-way hash, and
> > cannot reproduce the original data.
>
> It really depends on exactly what the model does.

I was lucky enough to be at an NLP conference last week where I asked
some people this same question, and got confident replies that what I
want to do is fine. Again, people were saying that the impossibility of
reconstructing the original is the key.

> > This is your litmus test. Can you reliably reconstruct the original
> > text? If so, it is a derivative work. If not, then it isn't.
>
> That's in the ballpark, but I'm pretty sure that's not the litmus
> test. The test is the reverse, ie, more like "if you know the
> original content, can you recognize something that probably has copied
> the expression of it?"

The models I have in mind pass that test too.

Word embeddings [1] that use multiword expressions or n-grams might be
a more interesting grey area when "n" is high enough (because the text
for each embedding is stored). (But I'll hazard a guess that n-grams up
to at least 4 or 5 are going to be okay.)

...oh, just realized, 1-way hashing of the text will still allow the
embeddings to work, and then it passes your other test too.

Darren

[1]: https://en.wikipedia.org/wiki/Word_embedding
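
The hashing idea at the end of the message can be sketched roughly as follows. This is only an illustration under stated assumptions, not anything from the thread: the function names (`ngram_key`, `build_table`, `lookup`), the choice of SHA-256, and the random placeholder vectors standing in for trained embeddings are all hypothetical. The point is just that the table can be keyed by a digest of each n-gram rather than by its text, and lookups still work because the query is hashed the same way.

```python
import hashlib
import random


def ngram_key(tokens):
    """Return a one-way SHA-256 hex digest identifying an n-gram."""
    text = " ".join(tokens).encode("utf-8")
    return hashlib.sha256(text).hexdigest()


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_table(tokens, n, dim=4, seed=0):
    """Map hashed n-grams to vectors.

    The vectors here are random placeholders (an assumption for the
    sketch); a real model would store trained embeddings instead.
    """
    rng = random.Random(seed)
    table = {}
    for gram in ngrams(tokens, n):
        key = ngram_key(gram)
        if key not in table:
            table[key] = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    return table


def lookup(table, gram):
    """Hash the query n-gram and return its vector, or None if unseen."""
    return table.get(ngram_key(gram))


corpus = "the quick brown fox jumps over the lazy dog".split()
table = build_table(corpus, n=3)

seen = lookup(table, ("quick", "brown", "fox"))
unseen = lookup(table, ("lazy", "brown", "dog"))
```

Here `seen` is a vector and `unseen` is `None`, while the table's keys are hex digests with no source text in them. One caveat worth keeping in mind: anyone who already has a candidate n-gram can hash it and test whether it is in the table, so this keeps the text out of the released file rather than making membership undetectable.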