
Zacchiroli: all Debian source are belong to us

Stefano Zacchiroli has announced the sources.debian.net (sources.d.n) web site, which hosts the source code for Debian packages. "Via sources.d.n you can therefore browse the content of Debian source packages with usual code viewing features like syntax highlighting. More interestingly, you can search through the source code (of unstable only, though) via integration with http://codesearch.debian.net. You can also use sources.d.n programmatically to query available versions or link to specific lines, with the possibility of adding contextual pop-up messages (example)."


Zacchiroli: all Debian source are belong to us

Posted Jul 3, 2013 4:34 UTC (Wed) by shmerl (guest, #65921) [Link] (9 responses)

I don't see syntax highlighting there for some reason.

Zacchiroli: all Debian source are belong to us

Posted Jul 3, 2013 5:01 UTC (Wed) by Wummel (guest, #7591) [Link] (8 responses)

Works for me: Wrap.pm.diff. I suspect syntax highlighting is activated on file extensions, not on content inspection since Makefiles named "debian/rules" are not highlighted.

Zacchiroli: all Debian source are belong to us

Posted Jul 3, 2013 6:42 UTC (Wed) by zack (subscriber, #7062) [Link] (7 responses)

That's correct: syntax highlighting is activated based on file extensions.

While the client-side highlighting engine we use (http://softwaremaniacs.org/soft/highlight/en/) does support language auto-detection, in our experiments it wasn't reliable enough to be used on such a large code base. The number of false positives in language detection was too high for it to be worthwhile. We therefore went back to extension-based language detection (which has false positives too, of course, but it seems to be way better than automatic detection).

If people have alternative client-side syntax highlighting engines to suggest, which are provably better on such a large code base, we're open to suggestions (and patches ;-))

Zacchiroli: all Debian source are belong to us

Posted Jul 3, 2013 7:37 UTC (Wed) by kugel (subscriber, #70540) [Link] (6 responses)

You may at least parse the first line (or two) in addition, so that #! is recognized, vim modelines would be a bonus.

Both techniques are used by Geany to detect the filetype additionally to the extensions (in fact they override the extension) and it works awesome.

cowsay, the primary example used in the examples page, is _not_ highlighted.

Zacchiroli: all Debian source are belong to us

Posted Jul 4, 2013 8:36 UTC (Thu) by zack (subscriber, #7062) [Link] (5 responses)

> You may at least parse the first line (or two) in addition, so that #! is recognized, vim modelines would be a bonus.

Right. Thanks for the suggestion. As we're swamped with feature requests right now (which is a good thing, I guess :-)), I've for the moment added it to the public TODO list.

> Both techniques are used by Geany to detect the filetype additionally to the extensions (in fact they override the extension) and it works awesome.

In Geany, do you have (or use) a db mapping shebang lines to languages that can be reused? We can obviously hardcode that in Debsources, but it sounds like knowledge that can easily be factored out. In particular, I'm thinking/worrying here about all the variants of Python shebang lines.

If you've something reusable, please shout!
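
To make the worry about Python shebang variants concrete, here is a minimal sketch in Python of how a shebang line could be normalized before looking it up. It is not the Debsources or Geany code: the function name language_from_shebang and the small INTERPRETERS table are hypothetical, and the language labels on the right-hand side are only illustrative.

```python
import posixpath
import re

# Hypothetical interpreter-to-language table, for illustration only.
INTERPRETERS = {
    "sh": "bash",
    "bash": "bash",
    "python": "python",
    "perl": "perl",
    "ruby": "ruby",
}


def language_from_shebang(first_line):
    """Return a language name for a '#!' line, or None if unrecognized."""
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    if not parts:
        return None
    program = posixpath.basename(parts[0])
    # '#!/usr/bin/env python3' names the real interpreter as an argument;
    # skip env options ('-S') and variable assignments ('FOO=bar').
    if program == "env":
        args = [p for p in parts[1:] if not p.startswith("-") and "=" not in p]
        if not args:
            return None
        program = posixpath.basename(args[0])
    # Drop trailing version suffixes: python3, python2.7 -> python.
    program = re.sub(r"[\d.]+$", "", program)
    return INTERPRETERS.get(program)
```

With this normalization, "#!/usr/bin/python", "#!/usr/bin/python2.7" and "#!/usr/bin/env python3" all resolve through the single "python" entry, so the table itself stays small.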

Zacchiroli: all Debian source are belong to us

Posted Jul 4, 2013 18:10 UTC (Thu) by kugel (subscriber, #70540) [Link] (4 responses)

You can see the code here: http://sources.debian.net/src/geany/1.23.1%2Bdfsg-1/src/f... ; as you can see, it also parses <html, <?xml, <?php (for embedded scripts) as a bonus.

It has the mapping table, you have to decide in what way to reuse it :)

Heh, I was about to point you to it via github, but then I found using the service in this topic would be much better :)

Zacchiroli: all Debian source are belong to us

Posted Jul 4, 2013 20:56 UTC (Thu) by zack (subscriber, #7062) [Link] (3 responses)

> Heh, I was about to point you to it via github, but then I found using the service in this topic would be much better :)

Indeed, best meta-bug-report ever ;-)
Thanks for the pointer, we'll take the same approach for debsources.

Zacchiroli: all Debian source are belong to us

Posted Jul 4, 2013 21:55 UTC (Thu) by kugel (subscriber, #70540) [Link]

For clarification (I myself misunderstood the code): it parses the first line(s) for <html, <?xml and <?php in case these tags appear in files that do not end with .html, .xml or .php respectively (like HTML embedded into Perl code). If a file containing those tags ends with a hardcoded extension, the extension takes precedence (find_shebang() returns NULL); otherwise the file type is set to e.g. XML even if the file extension was not .xml.

Even if that's too much smarts for you, detecting the shebang alone helps a lot already :)

Glad I could help Debian :)
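
The precedence rule described above can be sketched roughly as follows, in Python. This is only an illustration of the description in this comment, not a port of Geany's find_shebang(): the function name language_from_markup, both tables, and the choice to look only at the start of the file are assumptions.

```python
import posixpath

# Hypothetical tables; the real lists live in the Geany sources.
MARKUP_TAGS = (("<?php", "php"), ("<?xml", "xml"), ("<html", "html"))
HARDCODED_EXTENSIONS = {".html", ".xml", ".php"}


def language_from_markup(filename, head):
    """Guess a markup language from the start of a file.

    Returns None when the extension should decide instead, or when no
    known tag is found.
    """
    ext = posixpath.splitext(filename)[1].lower()
    if ext in HARDCODED_EXTENSIONS:
        # The extension takes precedence: report nothing.
        return None
    # Assumption: only a tag at the very start of the file counts.
    start = head.lstrip().lower()
    for tag, language in MARKUP_TAGS:
        if start.startswith(tag):
            return language
    return None
```

For a file named "feed" that begins with an XML declaration this returns "xml", while "page.html" always returns None so that the extension-based mapping is used.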

Zacchiroli: all Debian source are belong to us

Posted Jul 4, 2013 22:13 UTC (Thu) by kugel (subscriber, #70540) [Link] (1 responses)

FWIW, Geany's default extension-to-filetype mapping might be useful for you too: http://sources.debian.net/src/geany/1.23.1%2Bdfsg-1/data/...

Zacchiroli: all Debian source are belong to us

Posted Jul 15, 2013 15:49 UTC (Mon) by zack (subscriber, #7062) [Link]

Heya, sources.debian.net now uses the same strategy (extension + shebang override) and the same mapping (modulo those languages not supported by highlight.js) as Geany.

Thanks for the tip!

Zacchiroli: all Debian source are belong to us

Posted Jul 3, 2013 6:37 UTC (Wed) by eru (subscriber, #2753) [Link]

Great! Since Debian contains almost every interesting Free software package, this is a fantastic one-stop site for browsing and searching them, also for those not using Debian.

Zacchiroli: all Debian source are belong to us

Posted Jul 3, 2013 10:34 UTC (Wed) by andka (guest, #974) [Link] (5 responses)

For anyone who thinks the subject line has some syntactic problems:

http://en.wikipedia.org/wiki/All_your_base_are_belong_to_us

Zacchiroli: all Debian source are belong to us

Posted Jul 3, 2013 16:38 UTC (Wed) by tjc (guest, #137) [Link] (3 responses)

It does have a syntax problem. All natural languages tend to decay over time, and the rate of decay increases as large groups of people begin to misuse them.

Zacchiroli: all Debian source are belong to us

Posted Jul 4, 2013 12:40 UTC (Thu) by mpr22 (subscriber, #60784) [Link] (2 responses)

All natural languages change over time; that's why this discussion is occurring in Modern English instead of Proto-Indo-European.

People who dislike particular changes (or who just want the language to be preserved unchanged forever; those people should speak Latin instead and leave our beautiful living languages alone) are of course free to rail against those changes, calling them "misuse" or "decay".

Besides, riffing on a widely-disseminated pop-cultural reference is not "misuse" in the first place, even when the base material is an almost stereotypically awful Japanese-to-English translation.

Zacchiroli: all Debian source are belong to us

Posted Jul 5, 2013 5:10 UTC (Fri) by tjc (guest, #137) [Link] (1 responses)

> People who dislike particular changes ... are of course free to rail against those changes, calling them "misuse" or "decay".

It's not unreasonable to refer to the use of the wrong verb tense as misuse.

Zacchiroli: all Debian source are belong to us

Posted Jul 5, 2013 13:39 UTC (Fri) by man_ls (guest, #15091) [Link]

It depends entirely on the context. The wrong verb tense for one group can be the correct one for another. Some groups regularize certain features which for others should be irregular, and vice versa. Given some time, foul pseudo-dialects can grow to be full-blown languages in their own right.

Zacchiroli: all Debian source are belong to us

Posted Jul 3, 2013 21:41 UTC (Wed) by geuder (subscriber, #62854) [Link]

Thanks for the pointer. I already wondered what that means. Native speakers should be aware of the concept of offshore English http://www.usingenglish.com/articles/what-offshore-englis..., not only if they want to sell something but also when discussing software.

I have found myself preferring to co-operate with non-native speakers in open source projects, because native speakers might just dominate others using their language skills. Not that it happens frequently, but it exists.

Well, there is always the balance whether the non-native learns something new or stays in the darkness. In this case the end result was eventually the better one.

sources.debian.org

Posted Jul 3, 2013 10:52 UTC (Wed) by DonDiego (guest, #24141) [Link] (2 responses)

Possibly silly question: Why isn't this hosted under debian.org?

sources.debian.org

Posted Jul 3, 2013 14:10 UTC (Wed) by kaeso (guest, #49701) [Link]

debian.net is a second-level domain accessible to each developer, mostly for experiments and early proof-of-concepts with self-hosted services, whereas debian.org is the hierarchy used for official machines/services and centrally administered by the Debian sysadmin team.

For an (incomplete) list of such experimental services, see http://wiki.debian.org/DebianNetDomains

If/when a service becomes stable and accepted it is usually moved to the .org space. In this case, check http://wiki.debian.org/source.debian.org for the plan.

sources.debian.org

Posted Jul 4, 2013 8:39 UTC (Thu) by zack (subscriber, #7062) [Link]

> Possibly silly question: Why isn't this hosted under debian.org?

@kaeso's answer is spot-on on this.

I'd just like to add that you should think of debian.net as a "staging area" for new official Debian services. Once services have been tested and polished enough, their maintainers usually take the step of integrating them into the official debian.org infrastructure, administered by the Debian sysadmins. I definitely intend to do that in the medium term for sources.d.n, but before that we have quite a few TODO items to smash. The service is definitely still in its infancy.

Very useful

Posted Jul 3, 2013 13:11 UTC (Wed) by zdzichu (subscriber, #17118) [Link]

This is really useful! Until now, to check how Debian packages software, I had to download tarballs fished out of packages.debian.org, unpack them and analyze them. Now all this data is on a silver platter.
Thank you very much, it will certainly help in reducing cross-distro differences.


Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds