Mailing List Archive
Re: [tlug] Open source license (wikipedia)
- Date: Wed, 23 May 2018 10:35:28 +0900
- From: "Stephen J. Turnbull" <turnbull.stephen.fw@example.com>
- Subject: Re: [tlug] Open source license (wikipedia)
- References: <01967dcf-dc9e-0f08-b0d1-7c844db58684@dcook.org> <23293.29039.66965.697994@turnbull.sk.tsukuba.ac.jp> <23b83822-c9c6-5bb3-3cc2-bbbdee83640b@dcook.org>
Darren Cook writes:

 > Again, people were saying that the impossibility of reconstructing
 > the original is the key.

I am quite sure that is wrong. For example, in a highly optimized C or C++ program you will be unable to reconstruct the original from the compiled, stripped executable (loops may get unrolled, dead code eliminated, common subexpressions coalesced, etc., and of course with the symbols stripped you won't be able to reconstruct variable names), but there is no doubt whatsoever that the original copyright on the source code persists in that executable. (If you receive a program as source code, there is an implied license to compile it for your own use, but not to copy or redistribute the executable you compiled.)

 > Word embeddings [1] that use multiword expressions or n-grams might be a
 > more interesting grey area when "n" is high enough (because the text for
 > each embedding is stored). (But I'll hazard a guess that n-grams up to
 > at least 4 or 5 is going to be okay.)

That's not the way this works. It's not the number of words in an n-gram; it's the number of n-grams that matters. Even 1-grams are hazardous if your corpus is the work of an author with idiosyncratic spelling or frequent neologisms (e.g., James Joyce).

An example: there is an infosec tweep I enjoy following (thegrugq), and somebody created a Markov 'bot (thegrugq_ebooks) trained on a corpus of thegrugq's tweets. I once had to look twice to realize that a tweet I took for the 'bot's output, because of its peculiar not-quite-English syntax, was actually written by a third party. I suspect the third party was under the influence, but it really "looked like" the 'bot. And that's what matters.

Now, the FSF has a "15 line rule": contributions under that length don't need an assignment. But that 15 lines applies to the *union* of the contributor's patches, *not* to *individual* patches. So one tweet is probably not enough to infringe the 'bot's copyright. :-) On the other hand, it's not obvious to me that, with a few thousand tweets by now, the 'bot can't infringe thegrugq's copyright....

Steve
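To make the "number of n-grams, not the size of n" point concrete, here is a minimal Python sketch that counts how many of a candidate text's n-grams also occur in a corpus. The function names (`ngrams`, `shared_ngrams`) and the sample strings are invented for illustration; whitespace tokenization is an assumption, and the counts are only a similarity measure, not a legal test of anything.

```python
def ngrams(words, n):
    """Return the consecutive n-word tuples in a list of words."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def shared_ngrams(corpus_text, candidate_text, n=1):
    """Return (shared, total): how many of the candidate's n-grams
    also appear somewhere in the corpus, and how many it has in all.

    Tokenization is a plain whitespace split (an assumption; a real
    tool would normalize case and punctuation first).
    """
    corpus = set(ngrams(corpus_text.split(), n))
    candidate = ngrams(candidate_text.split(), n)
    shared = sum(1 for gram in candidate if gram in corpus)
    return shared, len(candidate)


# Hypothetical corpus with idiosyncratic vocabulary.
corpus = "the quarkling bot snarfs the tweetstream and the bot snarfs again"

# Every single word of this candidate comes from the corpus ...
print(shared_ngrams(corpus, "the bot snarfs the quarkling", n=1))
# ... while an ordinary sentence shares only its common words.
print(shared_ngrams(corpus, "a bot reads the news", n=1))
print(shared_ngrams(corpus, "a bot reads the news", n=2))
```

Even at n=1 the first candidate is built entirely from the corpus, which is the situation described above for an author with distinctive vocabulary; raising n only changes how quickly ordinary text stops matching.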
- Follow-Ups:
  - Re: [tlug] Open source license (wikipedia)
    - From: Benjamin Kowarsch
- References:
  - [tlug] Open source license (wikipedia)
    - From: Darren Cook
  - [tlug] Open source license (wikipedia)
    - From: Stephen J. Turnbull
  - Re: [tlug] Open source license (wikipedia)
    - From: Darren Cook