KDE.News takes a
look at Nepomuk. "The KDE team working on Nepomuk aims to bring
the Semantic Desktop to KDE 4, allowing applications to share and respond
intelligently to meta data about files, contacts, web pages and more. Let
us make this short: Nepomuk is an important project for the future KDE
desktop. Its goal is to get all the information available on the system to
the user. You are receiving an email - Nepomuk should show you information
relevant to related projects or persons or tasks. You look at images of a
person - Nepomuk should have links to other images of that person or
unanswered emails or events you met that person at. You open the video
player - Nepomuk should propose to watch the next episode in the series you
are currently watching."
The Semantic Desktop Wants You (KDEDot)
Posted Oct 23, 2009 19:55 UTC (Fri) by bronson (subscriber, #4806)
This sounds like another one of these pie-in-the-sky projects like Tracker that promise the world but never manage to produce a usable product [*].
And Nepomuk has a larger and more poorly defined goal than Tracker! "Its goal is to get all the information available on the system to the user." Really? OK, good luck with that!
Two thoughts...
1. Please keep my private data private? I don't want to be reluctant to store my tax returns on my home computer.
2. Beware the paperclip. "I see you're trying to watch a video! Can I help you with that?"
So, I'm skeptical, but it should be fun to watch.
[*] In fairness, Tracker is still in active development and could still live up to its promises one day. Also, I realize that the Tracker and Nepomuk projects have been collaborating well.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 23, 2009 21:23 UTC (Fri) by drag (subscriber, #31333)
I have used tracker as an indexer for my desktop for a couple of years now.
It is usable and efficient at what it does.
What is left is to produce software that can do something more useful with
the information it creates.
it works...
I alias tracker-search to sch
$ sch vim
Defaulting to 'files' service
Results: 16
/home/drag/Repo/stuff/vim/plugin/taglist.vim
/home/drag/Repo/stuff/vim/plugin/minibufexpl.vim
/home/drag/Repo/stuff/vim/plugin/bufkill.vim
/home/drag/Repo/stuff/vim/plugin/supertab.vim
etc etc.
Without it the best comparable would be something like:
find /home/drag -type f -print0 |xargs -0 grep vim
Of course tracker takes a fraction of a second to return results versus
10-20 minutes with find/xargs/grep
And it is useful for more than just keeping track of the first 2500 words
of contents of files...
The Semantic Desktop Wants You (KDEDot)
Posted Oct 24, 2009 18:13 UTC (Sat) by SEMW (guest, #52697)
> Without [tracker] the best comparable would be something like: find /home/drag -type f -print0 |xargs -0 grep vim
Surely the relevant comparison is not between tracker and 'find', but tracker and 'locate'? Given that tracker and locate are both indexed searches, and find is not.
Indeed, I switched from tracker to locate as my primary indexed search tool a while ago, after being annoyed by (1) Ubuntu bug #163544, (2) lack of a good tracker plugin to gnome-do back then, (3) trackerd seemed to slow the system down in a more constant way than a middle-of-the-night updatedb, and (4) locate searches the whole filesystem (not just the home directory) by default and is still pretty fast; tracker gets quite a bit slower if you ask it to do the same. I don't know how many of these are now better (probably most of them, though (1) is still not fixed), but locate works fine for what I use it for, so I've not much reason to switch.
(BTW, surely you can just use "-name '*vim*'" rather than fiddling about with null-separated-pipes-to-grep?)
The Semantic Desktop Wants You (KDEDot)
Posted Oct 24, 2009 22:41 UTC (Sat) by drag (subscriber, #31333)
> Surely the relevant comparison is not between tracker and 'find', but
> tracker and 'locate'? Given that tracker and locate are both indexed searches,
> and find is not.
Well, locate only does its search on file names, not contents. So that is
the major difference. Grep/find and tracker do it on content, not file
names. Of course tracker can search on metadata, also, which grep may
have a hard time with, depending on the file format.
The tracker-search I did found a lot more than just files with "vim" in
their filenames, which I neglected to include in my sample output.
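To make the name-versus-content distinction concrete, here is a rough Python sketch. It is not how locate, grep or tracker are implemented (none of them walk the tree like this on every query, and tracker uses a prebuilt index); the function names match_by_name and match_by_content are made up for illustration.

```python
import os


def match_by_name(root, term):
    """Paths whose file NAME contains term (the question a name-only index answers)."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if term in name:
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)


def match_by_content(root, term):
    """Paths whose CONTENTS contain term (what grep or a content index answers)."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # errors="ignore" lets us scan binary-ish files without crashing.
                with open(path, "r", errors="ignore") as fh:
                    if any(term in line for line in fh):
                        hits.append(path)
            except OSError:
                # Unreadable file: skip it rather than abort the whole search.
                continue
    return sorted(hits)
```

A file called notes.txt that mentions vim shows up only in the second search; a file called taglist.vim that never contains the word shows up only in the first.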
------------
I have found that tracker is a good indexer that can run on slow and
low-resource machines. How much memory you allocate to it and such is
configurable to a certain extent, but it will consume more resources if you
have a high rate of file system churn.
So in the past I would have problems with it and bittorrent, for example.
It uses inotify to respond to file system events in real-time but with that
sort of workload it would spaz out and re-index the same large binary files
over and over again. Same thing with my "download" directory.
So you just use its configuration utility to exclude those directories.
But I think it is much less of a problem than it used to be.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 25, 2009 17:24 UTC (Sun) by SEMW (guest, #52697)
> Well locate only does its search on file names, not contents.
Ah, of course; sorry, I should have remembered.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 26, 2009 19:17 UTC (Mon) by cmccabe (subscriber, #60281)
> Indeed, I switched from tracker to locate as my primary indexed
> search tool a while ago
I also use slocate instead of trackerd.
I think that updating the index in a cron job makes a lot more sense than updating it immediately. Just because I downloaded a PDF from some website doesn't mean I want it added to the index Right Now. (Which by the way, will load my CPU at 100% for a minute or two.)
C.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 24, 2009 2:31 UTC (Sat) by dkite (guest, #4577)
It will be as useful as it is invisible.
When applications take advantage of the api to index their data, do the
connections and tagging, and use the connections in helpful ways, it will be
useful.
The chromium browser uses the google data store to respond when you type
into the address bar. Very useful, but all the technology is hidden.
If Nepomuk is used in the same way, it will become indispensable.
Derek
The Semantic Desktop Wants You (KDEDot)
Posted Oct 24, 2009 22:09 UTC (Sat) by sebas (subscriber, #51660)
> 1. Please keep my private data private? I don't want to be reluctant to
> store my tax returns on my home computer.
The index is stored on your local filesystem, in your home directory, and
is specific to each user. It's not exposed to the network (which has the
obvious downside that it's hard to share meta information across different
machines). You can also specify which directories to index for file
searching.
> 2. Beware the paperclip. "I see you're trying to watch a video! Can I
> help you with that?"
The idea is more to integrate it into applications, so you can easily get
at files you're working on in a specific project, or things you've
edited or read lately. Another result will be an implementation of
contacts on your desktop. These will be defined in a standardized way, using
for example the PIMO ontology. A benefit here is that applications can
more easily use the same concepts for different goals and have their data
shared. Things like Recent Documents become more reliable and more
powerful; tagging and rating of files becomes universal across your
desktop and applications. I think these kinds of tangible benefits are often
missed when talking about The Grand Vision behind Nepomuk.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 25, 2009 3:36 UTC (Sun) by mpokrywka (subscriber, #43229)
>> 1. Please keep my private data private? I don't want to be reluctant to
>> store my tax returns on my home computer.
I am also rather concerned about what leaves my desktop, i.e. when I tag one of my trip photos with a well-known hotel brand, I would not want to have:
"You look at images of a person - Nepomuk should have links to other images of that person" based on internet image search results.
Maybe I am paranoid/old fashioned, but I don't feel the urge to feed Google/Bing/Yahoo with snippets of all my life/creation.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 25, 2009 13:27 UTC (Sun) by Kit (guest, #55925)
Nepomuk wouldn't be doing random Google searches based on tags... even ignoring privacy issues, that wouldn't produce useful results without a rather clear understanding of what the tags meant.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 23, 2009 20:53 UTC (Fri) by petegn (guest, #847)
This sounds very bad for Linux we do not need some nosey piece of software crusing the system looking thru all your files making a note of just what is where easy meat for the FED's ect not a good idea at all we must have the choice of enabling or disabling it COMPLETELY ie completely removed from the system and that choice needs to be loud and up front with a FULL explanation of exactly what it does where it looks the sort of information it stores and Where it stores that information so that it can be completely removed from the system without trace and by that i mean full forensic traces ie absolutley no chance at all of anyone ever being able to find it .
The Semantic Desktop Wants You (KDEDot)
Posted Oct 23, 2009 21:28 UTC (Fri) by drag (subscriber, #31333)
If you're worried about the feds searching through your home directory and
finding incriminating information then not using indexing software is about
the shittiest way to go about protecting yourself.
Modern desktop OSes leak information like f-ing crazy. There is no way around
it. This software just uses that to your advantage. Not using it is not going
to hinder any sort of experienced investigator in any way.
If you want to protect yourself in a sane and actually useful manner then
encrypt your hard drive using LUKS and shut it down when you're not using it.
Then use encryption to protect information that is especially sacred so that
if an attacker gains access to your computer during runtime then they will not
likely be able to access it.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 24, 2009 16:05 UTC (Sat) by Janne (guest, #40891)
"This sounds very bad for Linux"
Quite the opposite, it sounds very good for Linux. Besides, KDE runs on other systems besides
Linux...
"we do not need some nosey piece of software crusing the system looking thru all your files
making a note of just what is where"
Who is this "we" you talk about? What makes you think that you are in a position to talk about
other people besides yourself?
I use both Linux and OS X. And on OS X I have Spotlight which indexes my HD and makes my
contents easy to find. And I at least find that very very useful. Nepomuk seems to take that idea
one step further. This is about making it easier for you to use your computer. If you think that
this is some kind of tool for authorities to snoop on you, then I think you are way off-base.
"easy meat for the FED's ect not a good idea at all we must have the choice of enabling or
disabling it COMPLETELY ie completely removed from the system and that choice needs to be
loud and up front with a FULL explanation of exactly what it does where it looks the sort of
information it stores and Where it stores that information so that it can be completely removed
from the system without trace and by that i mean full forensic traces ie absolutley no chance at
all of anyone ever being able to find it ."
Have you thought about wrapping your computer in tin-foil? That should also help. Seriously:
there's really no need for paranoia here. And hey, if you are so paranoid, why are you online? Do
you have any idea how much information can be gathered from you based on your net-use?
And have you heard of these things called "periods"? They are the things on your keyboard that look
like this: . Have you considered using them in your text?
The Semantic Desktop Wants You (KDEDot)
Posted Oct 24, 2009 21:20 UTC (Sat) by jospoortvliet (subscriber, #33164)
At least some ppl understand what we're trying to do here ;-)
The Semantic Desktop Wants You (KDEDot)
Posted Oct 23, 2009 22:43 UTC (Fri) by horen (subscriber, #2514)
As much as I like KDE/3 (it's been my desktop-of-choice for the past five years, and I do.not.like KDE/4 at.all), this is yet another reason for me to move to LXDE.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 24, 2009 14:53 UTC (Sat) by felixrabe (guest, #50514)
Feel free and enjoy.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 24, 2009 16:14 UTC (Sat) by Janne (guest, #40891)
Making the desktop smarter and easier to use is yet another reason to not use that software? Um,
OK. Or is this about performance? I could see performance being an issue.... If you use a computer
that is over 5 years old, that is. If you use a computer that is at least remotely modern, I really can't see how
performance could be an issue.
I find it surprising that some people have powerful computers (and just about all computers are
powerful these days), yet they insist on leaving all that power unused and use software which does
not take advantage of that power in any shape or form....
If you are stuck on a Pentium II or something then go ahead and run something like LXDE. Its
system-requirements seem to mirror Windows 98. So if your computer is 10 years old, it might be
smart to not run KDE. But have you ever considered getting a faster computer? Seriously?
The Semantic Desktop Wants You (KDEDot)
Posted Oct 24, 2009 22:28 UTC (Sat) by ballombe (subscriber, #9523)
> I find it surprising that some people have powerful computers (and just about all computers are powerful these days), yet they insist on leaving all that power unused and use software which does not take advantage of that power in any shape or form....
Because we use the computer power and memory for real work? Anything that increases latency slows down real work.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 24, 2009 22:49 UTC (Sat) by drag (subscriber, #31333)
My time is more sensitive than my computer's time. It remains idle 99% of
the time while I have other computers that do more of the "grunt" work. (I
like to use my netbook for a desktop while I have other more powerful
computers do things like compiling or running VMs)
So if a desktop search and file management system can save me 10-15 minutes
of farting around looking for a nearly forgotten file or some old notes of
mine then it is worth spending 2-3 hours indexing a system... and it
very rarely takes that much time.
Have you ever found yourself googling for a PDF or some documentation you
already downloaded and put somewhere? Is that not a problem?
Especially with modern operating systems like Linux that are set up for
multitasking and with a low-priority indexing job running in the background
the extra load should be unnoticeable. This sort of thing is why we run
Linux.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 25, 2009 3:58 UTC (Sun) by dlang (✭ supporter ✭, #313)
If it saves you 15 min every 6 months when you need to do this sort of obscure search, but it adds an extra 1/10 of a second to something you do every 10 min, it only just breaks even on how much of your time it takes (1/10 s * 6/hour * 8 hours/day * 180 days = 864 seconds vs. ~900 seconds).
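The same back-of-the-envelope sum as a few lines of Python, in case anyone wants to plug in their own numbers. The function names are made up, and the inputs are just the assumptions from the paragraph above (a 0.1 s delay, 6 actions per hour, 8 hours a day, 180 days, 15 minutes saved).

```python
def overhead_seconds(delay_s, actions_per_hour, hours_per_day, days):
    """Total time lost to a small per-action delay over the whole period."""
    return delay_s * actions_per_hour * hours_per_day * days


def net_saving_seconds(saved_s, delay_s, actions_per_hour, hours_per_day, days):
    """Positive means the indexer saved time overall, negative means it cost time."""
    return saved_s - overhead_seconds(delay_s, actions_per_hour, hours_per_day, days)


# Numbers from the comment: 0.1 s, 6 times an hour, 8 h/day, 180 days, 15 min saved.
lost = overhead_seconds(0.1, 6, 8, 180)           # roughly 864 seconds
net = net_saving_seconds(15 * 60, 0.1, 6, 8, 180)  # roughly 36 seconds in favour
```

With these inputs the result is close to a wash, so the conclusion flips easily: halve the delay or double the number of searches and the indexer clearly wins, double the delay and it clearly loses.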
The Semantic Desktop Wants You (KDEDot)
Posted Oct 25, 2009 10:53 UTC (Sun) by Janne (guest, #40891)
Well, Spotlight on OS X saves me several minutes every day. It finds my emails, apps, files,
attachments... everything. I use it all the time. Whereas I was previously manually looking for
stuff in my computer, all that is now automated. Does it make the system slower? I haven't really
noticed. Even if it consumed 10% of my CPU-power (it doesn't) why would it matter since most of
the time my CPU is more or less idle? What benefit do I get from having my CPU run idle in the
background? I would much rather use that CPU-power to make me productive. And Spotlight
does exactly that. And Nepomuk would do that as well.
You seem to think that people are constantly using 100% of available CPU-power, and therefore
the amount of processes and daemons needs to be as little as possible. But reality is something
totally different. CPU is not the bottleneck, the bottleneck is the user. The amount of data in our
computers has increased exponentially, while users' capability to handle that data has NOT
increased at all. Nepomuk, Spotlight and other tools like that make managing that data easier,
and they make the user more productive.
We should optimize the weakest link in the workflow. And the weakest link is the user. By a wide
margin.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 25, 2009 16:42 UTC (Sun) by Tara_Li (subscriber, #26706)
I disagree about the PEBCAK view of where my slowdowns are. Right now, I have a hundred or so tabs open in Firefox, a dozen in Seamonkey, and several torrents seeding on Azureus. My CPU is at 50% utilization of 1GHz, but it's often stuttering on me as I type into this text field - I'm getting spurts of being 10-20 characters ahead of the display.
Considering it's a 3GHz dual-core AMD, I wouldn't expect to see any stuttering - but there you are. I get spurts up to 3GHz, and then right back down to 1GHz. So there's something in the scheduler, be it the OS scheduler (Ubuntu 9.04), or Firefox's scheduler. I don't really care which, I just know that it happens.
When I'm downloading a number of files via Azureus - I know the system often stutters as a whole, and when it does, XOSview shows a *LOT* of WIO time. The files are stored on a software RAID-5 of 5 SATA drives. But the hard drives certainly seem to be a bottleneck. And you want an indexing daemon hitting those hard drives almost constantly????? Geeeez! They're slow enough already, and I don't hear of anything on the horizon that's going to significantly speed them up!
I run SecondLife on a regular basis. There, the slowups are in the video card mostly, but the hard drives take regular slams as well. While SL is running, my CPU is pegged to 3GHz most of the time, and it's noticeable.
And there's another thing - this file indexing is going to cause a *HUGE* non-locality of data. It's going to *KILL* the IO caching, and as for memory caching - just how much time does the CPU waste waiting for data from main memory?
Powerful computers - yes, compared to my old TRS-80 Model I. But then, we're doing so much more with them, even on a lightly loaded machine. Mine is currently watching for USB devices, I have no idea what avahi is... and a huge number of other things Gnome seems to consider as basic functionality...
Ok, now I need to go find out why I have Evolution data server and evolution exchange storage running... I don't use Evolution as far as I know...
Why is running a modern Linux starting to look so much like running Windows? Not just the GUI, but when you dig down underneath?
The Semantic Desktop Wants You (KDEDot)
Posted Oct 25, 2009 18:18 UTC (Sun) by Kit (guest, #55925)
>and several torrents seeding on Azureus
I used to use Azureus... I ditched it when I realized that merely having it open would SLAUGHTER my system's resources... and an Azureus instance running on another machine in the same lan downloading/seeding would SLAUGHTER the internet connection of every other machine on the lan, even if the network activity was supposedly low. Merely shutting it down would take all the other machines from struggling to open *any* web page to being able to navigate completely normally.
Azureus is probably a large part of the reason I avoid java (at least on the desktop) like the plague. Maybe it's gotten better over the last couple years... but I highly doubt it, and I highly suspect that a lot of your slowdown is related to it.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 25, 2009 18:43 UTC (Sun) by drag (subscriber, #31333)
transmission is able to do what I want...
The Semantic Desktop Wants You (KDEDot)
Posted Oct 25, 2009 19:58 UTC (Sun) by nix (subscriber, #2304)
Weird. I've done massive seeds (multiple Linux distros at once, sort of
thing) using Azureus on a PIII with 320Mb of RAM that was *also* running a
pile of daemons and a firewall in UML. It didn't fly but it wasn't slow.
And it only slaughters the network if you don't throttle it (of course I
throttled the hell out of it: 10Kb/s each way, thankyouverymuch. I'd
probably give it 50Kb/s each way these days, more bandwidth.)
It eats a lot of CPU if you leave it on the wrong tabs, but the summary
tab is CPU-cheapish.
(Maybe it gets slow if the available bandwidth is large...)
The Semantic Desktop Wants You (KDEDot)
Posted Oct 25, 2009 18:35 UTC (Sun) by drag (subscriber, #31333)
> I disagree about the PEBCAK view of where my slowdowns are. Right now, I
> have a hundred or so tabs open in Firefox, a dozen in Seamonkey, and
> several torrents seeding on Azureus. My CPU is at 50% utilization of 1GHz,
> but it's often stuttering on me as I type into this text field - I'm
> getting spurts of being 10-20 characters ahead of the display.
Try using Chrome. The Linux scheduler is a lot better at managing multiple
web page renderings than Firefox is. Also Chrome is just much faster..
on my netbook scrolling up and down on a busy page in firefox is a painful
affair, but using Chrome is smooth. The only problem is flashblock on
chrome is not really that good yet and it tends to munge comments in LWN.
Unfortunately firefox has turned into something of a pig. :/ Also you may
want to look at your video card...
> When I'm downloading a number of files via Azureus - I know the system
> often stutters as a whole, and when it does, XOSview shows a *LOT* of WIO
> time. The files are stored on a software RAID-5 of 5 SATA drives. But the
> hard drives certainly seem to be a bottleneck. And you want an indexing
> daemon hitting those hard drives almost constantly????? Geeeez! They're
> slow enough already, and I don't hear of anything on the horizon that's
> going to significantly speed them up!
As long as you give the indexer much lower priority than your other
applications you should be fine. You can also filter out directories where
you're saving large temporary files and such, and avoid most of the overhead
and performance problems. The indexer should be idle most of the time and
only index files that change or are new; such is the magic of inotify. So
during gaming it should not be a problem unless the game generates lots of
files you are indexing.
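Roughly what "only index what changed, and skip the noisy directories" amounts to, as a Python sketch. To be clear about the assumptions: this is not tracker's code, and it polls modification times instead of using inotify (the standard library has no inotify binding), so treat it as a stand-in for the idea. The name changed_files and the "download" directory are made up for the example.

```python
import os


def changed_files(root, seen, exclude=()):
    """Return files under root that are new or modified since the previous call.

    seen maps path -> last observed mtime in nanoseconds and is updated in place.
    exclude is a collection of directory NAMES that are never descended into.
    """
    changed = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk skips them entirely.
        dirnames[:] = [d for d in dirnames if d not in exclude]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.stat(path).st_mtime_ns
            except OSError:
                # File vanished between listing and stat: ignore it.
                continue
            if seen.get(path) != mtime:
                seen[path] = mtime
                changed.append(path)
    return sorted(changed)
```

The first call returns everything outside the excluded directories (the expensive initial index); later calls return only what was touched in between. For the priority part, a process can lower its own CPU priority on Unix with os.nice(19) before it starts scanning.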
> Powerful computers - yes, compared to my old TRS-80 Model I. But then,
> we're doing so much more with them, even on a lightly loaded machine. Mine
> is currently watching for USB devices, I have no idea what avahi is... and
> a huge number of other things Gnome seems to consider as basic
> functionality...
Ya, the ability to hotplug USB is nice. Avahi is a network autodiscovery
service. It is a work-alike to what is used in OSX. So you can refer to
machines by "hostname.local" rather than looking up ip addresses (providing
mdns is set up correctly by your distro). It should allow you to set up
services on a network without having to configure a central DNS server or
whatever.
> Why is running a modern Linux starting to look so much like running
> Windows? Not just the GUI, but when you dig down underneath?
People tend to see only what they want to see. Linux is more complicated
than it used to be, but it is also much more usable. This complexity is not
needed on all systems and it is all configurable; I can still boot modern
Linux with Busybox from a single floppy (last time I tried was when 2.6.29
was new) and Debian can happily run most of your services with under 128MB
of RAM. It is still an open system and it is not difficult to understand
what everything does and what it is doing if you want to take time to learn
new stuff occasionally.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 26, 2009 7:55 UTC (Mon) by Janne (guest, #40891)
"I disagree about the PEBCAK view of where my slowdowns are."
You just don't notice the slowdowns that are caused by you. When you are wasting your time looking for a file (for example), that doesn't seem like a slowdown to you, because it seems to you that you are actually doing something. But when it takes a fraction of a second too long for a folder to open, you feel like it's holding you back. So you end up wasting a few seconds waiting for the computer to react to your commands.... while you wasted 10 minutes looking for that file.
Doing stuff yourself that should be done for you doesn't feel like wasted time, because you are actually doing something during that time (browsing folders, going through emails etc.). But when your computer stalls for a fraction of a second, it's jarring. But the fact remains that you lose a lot more time hand-holding the computer than you do waiting for the computer to follow your orders.
There is perceived latency (how the slowdown feels to you) and actual latency (how much time is actually wasted). PEBKAC increases the latter, while it doesn't contribute to the former. Computer slowdowns contribute to the former, while they make only a small contribution to the latter.
"Right now, I have a hundred or so tabs open in Firefox"
Ever thought about closing some of them? How can you even find the tab you are looking for, when you have to hunt for it among a hundred tabs?
"And you want an indexing daemon hitting those hard drives almost constantly????? "
It's weird how OS X manages to solve that problem... I don't have that problem, even though my HD was just a 5400rpm laptop-HD! Seriously, the only time there _might_ be slowdowns caused by the indexing, is when the initial index is created. After that it's just incremental updates to the index.
And like I said, being able to find anything instantly saves A LOT more time than occasional indexing of the HD slows you down.
"They're slow enough already, and I don't hear of anything on the horizon that's going to significantly speed them up!"
SSDs are A LOT faster than old-fashioned HDs.
"But then, we're doing so much more with them, even on a lightly loaded machine."
And that's how it should be. Yet here we are arguing whether using all that power and capabilities to help the user is a good thing or not....
"Why is running a modern Linux starting to look so much like running Windows? Not just the GUI, but when you dig down underneath?"
What is "Linux"? If you want barebones Linux-distro, there are options for that. Ubuntu and the like (including upstream-projects like KDE and GNOME) aim at creating a complete package that does as much as possible for the user. If that is not what you want, maybe you should run something else instead?
And I bet that you are using quite a bit of that "useless" background-functionality, you just don't know it.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 26, 2009 12:22 UTC (Mon) by nye (guest, #51576)
>You just don't notice the slowdowns that are caused by you. When you are wasting your time looking for a file (for example), that doesn't seem like a slowdown to you, because it seems to you that you are actually doing something. But when it takes a fraction of a second too long for a folder to open, you feel like it's holding you back. So you end up wasting a few seconds waiting for the computer to react to your commands.... while you wasted 10 minutes looking for that file.
I don't understand how anyone could possibly take that long to find a file. I have a couple of terabytes of data stretching back years. It's poorly organised with a lot of redundant copies of things, some of which may be slightly different, and if I spend more than 5 minutes trying to find something, I'd be damn sure it didn't exist - more than that though, I've seen no indication of how this sort of indexing could possibly help that. The only examples of 'intelligent' searching that people ever give are so trivial as to seem pointless.
>"Right now, I have a hundred or so tabs open in Firefox"
>Ever thought about closing some of them? How can you even find the tab you are looking for, when you have to hunt for it among a hundred tabs?
Why would that be difficult?
The Semantic Desktop Wants You (KDEDot)
Posted Oct 26, 2009 17:26 UTC (Mon) by dkite (guest, #4577)
Searching is a dead end. There was an application in the '80s that indexed
files. The trade journalists loved it, but no one else had a use for it.
I had a discussion a few years ago on a KDE IRC site, and asked how it
would improve on:
# cd ~/Mail
# grep -r whateverimlookingfor *
There wasn't a good answer.
But. If the data was structured, would that make a difference? If the email
was connected in some way to the 4-5 files of other data formats that you
were perusing at the same time?
A stand alone application that does this stuff will be equally useless. It
will only be useful if applications use the structured data store in some
way.
If you start entering a URL in the Chrome browser, it talks to Google and
finds URLs (not search terms, URLs) that match. Very useful, and an example
of having a structured data store.
Plus, with an api, if I want to index and search data within my
application, I don't have to roll my own.
Derek
The Semantic Desktop Wants You (KDEDot)
Posted Oct 31, 2009 16:45 UTC (Sat) by jospoortvliet (subscriber, #33164)
Actually I'm often looking for things where # cd ~/Mail # grep -r
whateverimlookingfor * would not be helpful at all.
I'm looking for a certain email but I don't really remember any specific
term in it - just know what it was related to. If some tool would be able
to figure out that relationship and find the file for me, I'd be
incredibly happy. Not saying Nepomuk will solve that (though it wants to)
as I can be very vague in my recollection (or lack thereof). Anyway,
point is, full text search often does not help me so I am hoping Nepomuk
can do better.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 26, 2009 13:11 UTC (Mon) by renox (subscriber, #23785)
>I'm getting spurts of being 10-20 characters ahead of the display.
Firefox is most probably the culprit (amusingly, its design isn't very good for multi-tabbed browsing). There's no good solution currently: you must either wait for Chrome to have a Flash-blocker and run on Linux, or wait for Firefox 4, which is supposed to change this.
>[cut]And there's another thing - this file indexing is going to cause a *HUGE* non-locality of data.
Somewhat true. Note though that unchanged files don't need to be reindexed.
>It's going to *KILL* the IO caching,
Uh? Only if it's poorly implemented: a good indexing application will throttle itself to avoid this and ask the OS to not cache the data read, a badly implemented indexer would indeed kill the IO caching.
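As a rough idea of what "throttle itself and ask the OS not to cache" could look like, here is a Python sketch. The assumption (mine, not a claim about any real indexer) is that the hint is given with posix_fadvise and POSIX_FADV_DONTNEED, which Python exposes as os.posix_fadvise on Unix; the function name read_without_caching and the pause_s parameter are made up. The call is advisory only: the kernel is free to ignore it.

```python
import os
import time


def read_without_caching(path, chunk_size=1 << 20, pause_s=0.0):
    """Read a file in chunks, optionally sleeping between chunks, then hint
    to the kernel that its cached pages are not needed again.

    Returns the number of bytes read. The indexing step itself is omitted.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            total += len(chunk)
            # A real indexer would tokenize/index the chunk here.
            if pause_s:
                time.sleep(pause_s)  # crude throttle: leave I/O time for others
        if hasattr(os, "posix_fadvise"):
            try:
                # offset=0, length=0 means "the whole file".
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
            except OSError:
                pass  # advisory only; some filesystems may refuse it
    finally:
        os.close(fd)
    return total
```

The sleep between chunks is the simplest possible throttle; an I/O scheduling class (ionice on Linux) would be the more usual way to keep a background reader out of the way.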
The Semantic Desktop Wants You (KDEDot)
Posted Oct 26, 2009 13:36 UTC (Mon) by NAR (subscriber, #1313)
> Does it make the system slower? I haven't really noticed. Even if it consumed 10% of my CPU-power (it doesn't) why would it matter since most of the time my CPU is more or less idle?
First of all, it's usually not the CPU that's the bottleneck, but the I/O subsystem - and it's very noticeable if some "background" process uses the I/O subsystem heavily. Secondly, I use GNOME, which has a process called beagle which is, I think, something similar to this kind of stuff. I regularly see beagled using 100% CPU, then I kill it. I also regularly see beagle filling up my quota with the indexes. I mean, do we really need a 170MB index for 2GB data when most of the data is in binary files? I've yet to use beagle to actually search for something, though.
I think there are two problems with this indexing software. First is the actual implementation - if it crashes, uses too many resources, etc. then it's just a nuisance, not a service. Second, I'm afraid those users who can't keep their files in order are too "technically challenged" to use an indexing service too. Windows has had some kind of searching facility for years (maybe decades); still, some people I know can't find any files they've created.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 25, 2009 10:27 UTC (Sun) by Janne (guest, #40891)
[Link]
"Because we use the computer power and memory for real work?"
99% of the time, the performance-bottleneck is between the chair and the keyboard. Nepomuk
and the like are meant to reduce the impact of that bottleneck. Even if it reduced the available
CPU-horsepower by a few percent, it would still be worth it, since CPU-horsepower is not the
bottleneck in today's computers. We have a HUGE amount of CPU-power available, while the
resources of the _user_ have remained the same. So why not use that excess CPU-power
to make the user more productive?
"Anything that increase latency slowdown real work."
Misplaced files and the like increase latency. Spotlight in OS X has saved my bacon more than
once. If I could find connections between different files I would be even more productive, even if
it consumed a few percent of my CPU-power. And that's what Nepomuk does. Nepomuk is
about reducing a bottleneck in our workflow. And that bottleneck is the user, not the CPU (or
RAM, or HD-space).
It's disingenuous to think that "anything that consumes RAM and/or CPU-resources is a bad
thing, since I can't then use those for real work!". Isn't it smart to use those available resources
to make our work easier and faster? If you want maximum amount of CPU-power and RAM
available, then surely you are using CLI as opposed to GUI? Surely you are using Links as opposed
to Firefox? I mean, we need to save our precious CPU-cycles for real work, right?
The Semantic Desktop Wants You (KDEDot)
Posted Oct 25, 2009 21:53 UTC (Sun) by ballombe (subscriber, #9523)
[Link]
> 99% of the time, the performance-bottleneck is between the chair and the keyboard.
This is an unsubstantiated claim that is clearly not true for a lot of people I know.
> Misplaced files and the like increase latency.
Agreed. Never misplace files. Software should not force you to save everything to ~/Desktop and then sort out the mess.
> It's disingenuous to think that "anything that consumes RAM and/or CPU-resources is a bad thing, since I can't then use those for real work!". Isn't it smart to use those available resources to make our work easier and faster? If you want maximum amount of CPU-power and RAM available, then surely you are using CLI as opposed to GUI?
Is it not some kind of ad hominem attack now ? But to answer your question, I stopped using X for work eight years ago.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 26, 2009 7:37 UTC (Mon) by Janne (guest, #40891)
[Link]
"This is an unsubstantiated claim that is clearly not true for a lot of people I know."
And it is true for many people that I know.
"Agreed. Never misplace files. Software should not force you to save everything to ~/Desktop and then sort out the mess."
So, we are now supposed to hand-hold the filesystem? We are required to create elaborate folder-structures where we can neatly (or not so neatly) arrange our files so we never lose anything? And even if we create those folder-structures, we could still lose files. Is that file in folder X or folder Z? What if that folder contains 30 files, should you look them all over to find the one file you are looking for? Should you waste your time drilling down through a multitude of folders?
This is exactly what I meant by letting the computer handle the ugly stuff, while the user can focus on actually doing useful stuff with his computer. Manual housekeeping of the filesystem is exactly the kind of thing that the computer should be handling.
"Is it not some kind of ad hominem attack now ?"
Um, no? If I attacked your person (your physical attributes for example), then it would be a personal attack. But I'm attacking your arguments.
"But to answer your question, I stopped using X for work eight years ago."
So why are you here complaining about GUI's if you don't even use one?
The Semantic Desktop Wants You (KDEDot)
Posted Oct 26, 2009 8:01 UTC (Mon) by dlang (✭ supporter ✭, #313)
[Link]
I don't think that anyone is disputing that this sort of capability will be useful for some people.
what worries me (and what I think worries most of the people with negative comments) is the fear that this will get built in to the infrastructure to the point that we don't really have the option to not use it.
the people who are developing this understandably think that it's great, and the way it's being talked about implies that this will be a fundamental part of KDE in the future.
the reason that the linux desktop is bloated and slow today is that there are dozens of ideas like this that are useful to some people some of the time all running all the time for everyone.
yes, for some people it doesn't matter, they have a computer that's fast enough. but this sort of thing is why you don't want to run a 'standard' desktop on a netbook, even though a netbook has specs that not too long ago would have been considered a power-user machine.
having this as an option is not a problem.
having this option turned on by default may be a problem
making it so that it's very hard to turn this option off is definitely a problem.
forcing someone to use a different desktop to avoid this problem is even worse.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 26, 2009 8:38 UTC (Mon) by Los__D (subscriber, #15263)
[Link]
> forcing someone to use a different desktop to avoid this problem is even worse.
This seems like the very best option of them all: a good desktop is focused. KDE and GNOME are clearly both exactly made for this kind of thing. If you want a no-frills, non-feature desktop, use one.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 26, 2009 8:48 UTC (Mon) by dlang (✭ supporter ✭, #313)
[Link]
I'm sorry to hear this.
do you know if the Gnome and KDE project leads agree with this? if so I know that I need to start working to learn and use one of those 'no frills' desktops and not waste my time on a desktop that I can't tune down to use on lower powered devices.
In the meantime I hope that someone develops a desktop that lets me select the frills that I want instead of taking the attitude that I must want all of them
The Semantic Desktop Wants You (KDEDot)
Posted Oct 26, 2009 14:16 UTC (Mon) by Los__D (subscriber, #15263)
[Link]
I have absolutely no idea what the GNOME and KDE project stance is.
We agree that if there's an easy way to disable a service, it should of course be offered; but if integration makes it impossible, so be it.
That said, features should of course come at as low a performance cost as possible, and high-cost features should have equally high (positive) impact.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 26, 2009 17:31 UTC (Mon) by dlang (✭ supporter ✭, #313)
[Link]
quote:
We agree that if there's an easy way to disable a service, it should of course be offered; but if integration makes it impossible, so be it.
actually, what I am saying is that if there isn't an easy way to disable a service, it shouldn't be integrated.
that's a very different stance than you are taking.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 27, 2009 6:04 UTC (Tue) by Los__D (subscriber, #15263)
[Link]
Oh, absolutely. The first part of the sentence was the agreement, the last part is the disagreement.
We cannot allow ourselves to be held back by junkyard computers.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 27, 2009 7:04 UTC (Tue) by dlang (✭ supporter ✭, #313)
[Link]
what would you call a machine with a 600MHz cpu, 256M of ram and 4G of storage?
you seem to call it a junkyard computer that linux should not bother to run on.
and if it was a 10 year old 600MHz Athlon, pentium II, or Mac you would be right about it being a junkyard computer.
however, those are also the specs on my brand new state-of-the-art Kindle DX
they are also very close to the specs of the latest Nokia N900 tablet/smartphone that will be released in the next few days.
so depending on the size of the case, those specs can either be obsolete junk, or the latest and best thing available.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 27, 2009 8:39 UTC (Tue) by Los__D (subscriber, #15263)
[Link]
No, I'm saying that GNOME and KDE aren't the kind of desktops that should "bother" to support that kind of computer.
- The "junkyard computer" was a bit over the top, and I'm sorry for that.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 31, 2009 17:01 UTC (Sat) by jospoortvliet (subscriber, #33164)
[Link]
Funny, btw, we (KDE) DO bother to support that. Lots of work and energy
are directed at the N900 and similar devices currently. And yes, we do
strive to have Nepomuk (like) technology running on it. Actually, Tracker
(which offers similar functionality) is already an integral part of the
software stack on the N900.
Managed properly it should be doable on a device like that. And improve
workflow.
I know resource usage is a sore spot with many FOSS people, and frankly,
that's completely understandable. But we must look at the bigger picture here.
KDE won't be used by 20% of the computer users worldwide anytime soon
(well unless the N900 and its successor really take off). So in a year or
3 from now, a LOW-END phone will have the hardware you describe, and even
a 5 year old desktop will be 4 times as strong as that. By then, a few IO
(SSD anyone?) and CPU cycles aren't a huge issue.
Functionality and ease of use DO use CPU and IO cycles. Sorry if this
kills a few dreams, but: you CAN NOT have your cake and eat it too. Even
if you start rewriting a modern desktop like KDE 4.x or upcoming Gnome 3.x
in assembly it won't run on a LISA or Apple II. Period. Luckily, hardware
gets better - at the end of the 80s the mere thought of interpreted
languages was NUTS - these days, browsers can't do without and think about
rewriting parts of the desktop in it. If the researchers from that time
had not thought 10 years ahead, we wouldn't have javascript, ruby, python.
We in KDE are looking forward. Yes, for some, that's not very nice. But be
honest - how many 'average' users have moved on to KDE 4.3 yet? Most are
years behind the state of the art software. Half the world still uses
Windows XP. Released in 2001, remember?
The Semantic Desktop Wants You (KDEDot)
Posted Oct 26, 2009 17:04 UTC (Mon) by wstephenson (subscriber, #14795)
[Link]
I (and others) agree with you, which is why KDE on openSUSE 11.2 does exactly this. Following the furore around Beagle being enabled by default, Nepomuk (and by extension its Strigi-based indexer) is disabled out of the box. If you want to enable it, it's 2 clicks in Configure Desktop->Desktop Indexer. If you don't enable it, the semantic tagging/rating/searching parts of the UI are not shown.
This default was chosen in agreement with the Nepomuk maintainer, as it will allow enough people to experiment with the semantic desktop without forcing it on everyone and creating a backlash.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 27, 2009 22:42 UTC (Tue) by ballombe (subscriber, #9523)
[Link]
> And it is true for many people that I know.
I do not dispute that, but this is still not sufficient to claim that '99% of people' do that.
> Um, no? If I attacked your person (your physical attributes for example), then it would be a personal attack. But I'm attacking your arguments.
If you are attacking my argument, then the way I use my computer is irrelevant, so why are you asking questions about my computing habit ? And now you ask
> So why are you here complaining about GUI's if you don't even use one?
This is clearly ad hominem, and wrong.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 28, 2009 9:21 UTC (Wed) by paulj (subscriber, #341)
[Link]
> So why are you here complaining about GUI's if you don't even use one?
> This is clearly ad hominem, and wrong.
No it's not. He's questioning your standing to argue this point. Life is too short
to argue every point with everyone, you have to be selective.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 25, 2009 2:59 UTC (Sun) by horen (subscriber, #2514)
[Link]
My "everyday" laptop is an IBM ThinkPad A20P, with a PIII/733MHz CPU, running WattOS; my workstation is hand-built (by me), with dual PIII/1.4GHz CPUs, running Linux Mint 7 (KDE CE). My "emergency" laptop is a Toshiba 2450cds, with a KII/333MHz CPU, running TinyMe Linux.
I don't play games, and have no reason, nor intention, to "upgrade" my computers. I'm also a Unix/Linux sysadmin (since 1988), unemployed for almost four years (no college degree, in this "Brave New World" of post-9/11 America). I also don't own a car, and ride a bicycle, instead.
There is a lot of functional, older hardware out there; and, while "smarter desktops" might be useful for some, many of us find ourselves fighting for lightweight, less "featured" desktop managers/environments.
Seriously.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 25, 2009 3:13 UTC (Sun) by Kit (guest, #55925)
[Link]
Some people want light desktops that do the bare minimum. Others want more featured desktops that'll take more system resources to be able to do more, hopefully easing the user's workload.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 25, 2009 10:43 UTC (Sun) by Janne (guest, #40891)
[Link]
"I don't play games, and have no reason, nor intention, to "upgrade" my computers."
Then don't. Use your old computers and keep on using GUI's designed for such computers. But
then don't start whining about not being able to use newer technologies. Hey, I bet that someone
who still uses a 20-year-old computer might start whining that he can't use a GUI at all, and how
efforts to make GUI's better are a "wasted effort"....
That said, my main workhorse is a MacBook Pro with 4GB of RAM and 2.4GHz dual-core CPU,
and I welcome any technology that makes me more productive. And guess what? I don't go
around complaining about design-decisions of Fluxbox, XFCE or some other lightweight-GUI
since I'm not a user of that software.
Besides, you could buy an uber-powerful computer for a few hundred dollars; it's not like you
need to be swimming in money to get powerful hardware.
"There is a lot of functional, older hardware out there; and, while "smarter desktops" might be
useful for some, many of us find ourselves fighting for lightweight, less "featured" desktop
managers/environments."
Go right ahead then. No-one is forcing you to use GNOME, KDE or anything like that. But I fail to
see what grounds you have to complain then. You have alternatives that are suitable for you, go
right ahead and use those. If KDE (or GNOME, or whatever) create new technologies that might
benefit from newer hardware, then I fail to see what grounds you have to complain, since you
aren't even in the "target market". I don't go around complaining how Ferraris consume too
much gasoline for my wallet, since I'm not going to buy one in any case. You want to use
lightweight, basic GUI's, so why are you complaining about more advanced GUI's if you aren't
even using one?
Besides, your comments make it sound like you DO have a reason to get more powerful
hardware, contrary to your claims. Not being able to run more modern desktops (that you might
want to use) with your current hardware is a valid reason to upgrade, is it not?
"Seriously."
Indeed.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 31, 2009 17:05 UTC (Sat) by jospoortvliet (subscriber, #33164)
[Link]
Well, sorry but you're a minority. Especially around the time this stuff
would hit the average user - they are generally 2-4 years behind on
cutting edge. By then your hardware fits in a low-end mobile phone. We're
looking forward with this stuff, not backward.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 24, 2009 16:40 UTC (Sat) by endecotp (guest, #36428)
[Link]
If the system hides stuff so well that we need something like this to find it and join it together, then that tells me that we're storing the data wrongly in the first place.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 24, 2009 21:19 UTC (Sat) by jospoortvliet (subscriber, #33164)
[Link]
Maybe, maybe not. You see, the days of 25 files on a floppy disk are
over... There is so much data these days, with so many connections between
them, it's impossible for a normal user to keep track of it. Movies,
music, documents, pictures... And just think of all the social media
accounts, the data they host for you. The different initiatives in KDE
combined can be very powerful in helping you control all this. Nepomuk
indexes and relates, Akonadi stores, Silk tries to bring the cloud to your
Plasma desktop using various technologies like Jolie. Combine this, and
your data is yours (Akonadi), even if it's online (Silk, Jolie) and easy
to find (Nepomuk); all presented on a flexible Plasma desktop.
Sure, dreaming, we're not there yet. But the technologies are quickly
maturing. Akonadi is ready for real use now, and with KDE 4.4 apps will be
using it - 4.5 might see a wholesale migration. Nepomuk - exactly the same
story.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 25, 2009 0:29 UTC (Sun) by dkite (guest, #4577)
[Link]
I have a collection of PDF's from the equipment I have installed in the
last couple of years. I have emails and documents exchanged during the
jobs. I have estimates in word processor files. Eventually the jobs were
invoiced, sometimes there are warranty issues, completion and certification
documents. We get emailed invoices from our suppliers, tagged to these
jobs. My calendar is a log of work done on this stuff.
Not unusual, but a whole bunch of data that needs to be kept around.
Previously we would have file folders with all these documents, filling
boxes in the back room. With that system, it is easier to go to the site
and read the serial number off the equipment as opposed to digging through
a box of files to find some obscure piece of paper.
It is possible to build an application to keep track of all this stuff. A
database with binary globs, all sorted, indexed and the like. But what I
need to do is different from what you need to do. And I'm not in an
industry with huge support from software vendors. There is choice, but all
of them assume a way of managing the business that may not be my way, and
may need a bunch of people in the office to keep the whole thing current.
So what if the stuff was sorted and indexed? Not simply a per word index,
but on ontologies or structures of specific data?
What if my application used the indexing and ontology of nepomuk (or
similar) not to move around, file or whatever these disparate sources of
stuff, but rather noted their existence and noted any links?
So then a few years later, I need an install date and serial number for
some piece of equipment. Both those data points were on the equipment
invoice from the supplier, and the customer invoice, plus maybe some
notation on my calendar. I could search the invoices through my financial
software. I could scroll through, search a bit through my calendar to find
a date range. Then I could look for supplier invoices (PDF, scanned
documents, etc.) around that time until I find what I need.
Or the indexing, with ontologies, can give me the documents tied to the
customer. Maybe the job. And narrow by supplier. And then some fine tuned
simple data indexing. And I get what I need.
This stuff can change how we write applications. It can make it useful and
organizable to run whatever application serves best for the specific
purpose you need, without having to worry about losing the link to a
defined workflow.
Indexing is useful, but tying it to structured data formats like these
ontologies makes it powerful.
Derek
The Semantic Desktop Wants You (KDEDot)
Posted Oct 25, 2009 0:33 UTC (Sun) by Kit (guest, #55925)
[Link]
>If the system hides stuff so well that we need something like this
>to find it and join it together, then that tells me that we're
>storing the data wrongly in the first place.
I have approximately 3,500 songs on my laptop. There's only so much a folder structure can do with that many songs. Thankfully, most decent media players will index the music and I'll be able to search via tags. Nepomuk is an extension of that, opening the info to applications OTHER than just the media player that has indexed it (and, potentially simplifying the development of media players).
I also have over 2,300 pictures on this computer from various events and trips, each with different people. I have over 1,100 pictures from a single trip, taken in various cities. There are some photo management applications out there, like Digikam, but they're usually rather large and complicated. So far for that trip I've only "sorted" (i.e. put in named folders) about 200 of the pictures, and only by city (most of them from a single city). Being able to also easily tag a picture with info such as what it specifically is of, people in it, the day, etc, would be VERY nice... it'll also be VERY nice to be able to easily find the picture based on the tags, even OUTSIDE such photo management applications.
Then, there's about a billion (no, not a literal amount, just more than I want to try to calculate) misc documents stored in a variety of locations, many created/downloaded on this computer, and many more recovered from a prior system that died on me. At first, the organization worked fine, but the number of documents (PDFs, word documents, spreadsheets, power points, etc) has FAR outgrown the folder structure.
A clever folder structure could help in all those situations (I have my music sorted: Artist/Album/Song), but all of those methods enforce one way of attempting to find something, but fail when approaching it any other way (what if I remember the song name, but forgot the artist/album? What if I want to look at all the songs by a specific artist? etc). Symlinks could help those situations, but would require a MASSIVE amount of work, and require predicting all the possible ways I'd want to filter before hand.
Desktop search helps with this, and Nepomuk is basically desktop search++... or more so, an extension to it (which may or may not make the analogy accurate depending on how you feel about C++ :P). Nepomuk involves using the data that Desktop search collects, in addition to the data the applications ALREADY have (but often just opt to lose/force the indexer to manually discover), more intelligently. It's definitely not an easy problem, but it has the potential to be a great benefit for DESKTOP users that have thousands and thousands of documents... a conservative estimate of the number of *personal* files I have on JUST THIS MACHINE is 8,500... I also have over 1,300 emails on my email account, but I use webmail for that as I've grown to hate all the email clients I've used... including Evolution, Thunderbird, Kontact/KMail, and many other OSS ones (they're either slow, crashy, huge resource hogs, cumbersome, or have horrible search mechanisms... or all the above).
Folders work well when you have a small number of files. But as you get more and more files, they begin to fail (extra pain when you have multiple flash drives with differing folder structures!).
The Semantic Desktop Wants You (KDEDot)
Posted Oct 25, 2009 4:10 UTC (Sun) by dkite (guest, #4577)
[Link]
And to repeat myself, it will be useful when automatic.
And to do what you or I want automatically requires serious technology. Page
recognition, face recognition. Location for photos, ie. gps tags from the
camera. Maybe even music recognition.
If I have to fill in tags, forget it. Won't happen. But applications
generally know the data they are dealing with, especially if there is an
index that they can lookup and find, for example, if I enter Nelson, B.C. it
can find another occurrence of that string that is tagged 'city,province' or
whatever the ontologies call them.
Not trivial problems to solve, but this stuff gets me excited.
Derek
The Semantic Desktop Wants You (KDEDot)
Posted Oct 25, 2009 13:35 UTC (Sun) by Kit (guest, #55925)
[Link]
>And to repeat myself, it will be useful when automatic.
>And to do what you or I want automatically requires serious
>technology. Page recognition, face recognition. Location
>for photos, ie. gps tags from the camera. Maybe even music
>recognition.
Definitely. There are already many rather reliable ways to recognize and auto-tag music out there (Amarok at least in the 1.x series had a really good one, along with the lyrics plugins to fetch the lyrics for the songs), unfortunately the others don't seem to have even close to as reliable automated methods yet (at least from a purely-software point of view... a camera with integrated GPS for geo-tagging would be very handy).
The Semantic Desktop Wants You (KDEDot)
Posted Oct 25, 2009 23:05 UTC (Sun) by endecotp (guest, #36428)
[Link]
> I also have over 2,300 pictures on this computer
I'll just pick up on that because it's an area where I have some personal experience. I also have a collection of thousands of photos and I built myself a TV-sized digital photo frame to view them on. It indexes the photos by location on a world map, on time-lines, and according to various tags.
My comment that "we're storing the data wrongly in the first place" can be illustrated by this. Let's say that I wanted to merge my photo timeline with my email timeline. I have all my email in a database too so it should be easy, but guess what - EXIF photo timestamps don't have a timezone. So fundamentally any attempt to merge those timelines won't work properly.
What has happened is that every different media type has got its own incompatible and incomplete metadata. Each new file format has re-invented the wheel of how to store even the most basic attributes. This should not be too surprising since the media formats were developed by specialists in those domains.
I believe that searching and indexing within domains (e.g. searching all my photos, or all my songs, or all my emails, or all my source code, or all my PDFs) is reasonable and useful (and I do most of those things). However, because of the disconnection between the different types, unification is hard to do and unlikely to be very successful. Furthermore, I'm unconvinced that unified searching is actually needed in many real-world situations.
Anyway, I hope you have fun trying.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 26, 2009 14:09 UTC (Mon) by Kit (guest, #55925)
[Link]
>It indexes the photos by location on a world map,
>on time-lines, and according to various tags.
...
>I have all my email in a database too so it should be easy,
>but guess what - EXIF photo timestamps don't have a timezone.
>So fundamentally any attempt to merge those timelines won't work
>properly.
Ah, but you already have all the metadata you need! If you already have the photo's location, you should be able to derive the timezone of the photo from that information.
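As a rough sketch of that idea in Python: once a location has been mapped to a UTC offset, the zone-less EXIF stamp can be placed on the same timeline as email dates. The `OFFSETS` table and the function name are hypothetical, the places were picked because they do not observe daylight saving time, and a real implementation would need a proper coordinates-to-timezone database rather than a fixed offset.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical lookup table: place -> UTC offset in hours.
# Fixed offsets ignore daylight saving time, so this is only a sketch.
OFFSETS = {"Tokyo": 9, "Reykjavik": 0, "New Delhi": 5.5}

def exif_to_utc(exif_stamp, place):
    """Turn a zone-less EXIF timestamp into an aware UTC datetime."""
    # EXIF stores date/time as "YYYY:MM:DD HH:MM:SS" with no zone.
    naive = datetime.strptime(exif_stamp, "%Y:%m:%d %H:%M:%S")
    local = naive.replace(tzinfo=timezone(timedelta(hours=OFFSETS[place])))
    return local.astimezone(timezone.utc)

print(exif_to_utc("2009:10:26 14:09:00", "Tokyo"))
```

The result can then be sorted alongside email timestamps, which already carry their own offset.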
The Semantic Desktop Wants You (KDEDot)
Posted Oct 26, 2009 17:35 UTC (Mon) by dkite (guest, #4577)
[Link]
>What has happened is that every different media type has got its own
>incompatible and incomplete metadata. Each new file format has re-invented
>the wheel of how to store even the most basic attributes.
And that will never change. But the application that handles, say, photos
knows about these quirks. If the application developers had a good indexing
api available to them, with the capability to define structures within that
api for their specific data, then possibilities start presenting
themselves.
Not for you, but for application developers.
I have used desktop search maybe twice. Useless and unused. But I
absolutely enjoy the Chromium url entry which accesses the google data
store.
Derek
The Semantic Desktop Wants You (KDEDot)
Posted Oct 26, 2009 20:34 UTC (Mon) by cmccabe (subscriber, #60281)
[Link]
> A clever folder structure could help in all those situations (I have my
> music sorted: Artist/Album/Song), but all of those methods enforce one
> way of attempting to find something, but fail when approaching it
> any other way (what if I remember the song name, but forgot the
> artist/album? What if I want to look at all the songs by a
> specific artist? etc).
I store my music in a specific folder structure, kind of similar to yours. I wrote a script to auto-generate folders full of symlinks so that I could access my files by either artist, or album, or song.
It works pretty well and doesn't require SQL, Java, C#, or daemons. My approach may not satisfy everyone, but it works for me.
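The script itself isn't shown here, but a minimal sketch of the idea in Python might look like the following. It assumes an Artist/Album/Song layout on disk; the function name and the "by-album" and "by-song" directory names are invented for this example. The existing tree already serves as the by-artist view.

```python
import os

def build_views(music_root, views_root):
    """Create by-album and by-song symlink trees for files laid out as
    music_root/Artist/Album/Song. Returns the number of links created."""
    made = 0
    for artist in sorted(os.listdir(music_root)):
        artist_dir = os.path.join(music_root, artist)
        if not os.path.isdir(artist_dir):
            continue
        for album in sorted(os.listdir(artist_dir)):
            album_dir = os.path.join(artist_dir, album)
            if not os.path.isdir(album_dir):
                continue
            for song in sorted(os.listdir(album_dir)):
                target = os.path.abspath(os.path.join(album_dir, song))
                link_dirs = (
                    os.path.join(views_root, "by-album", album),
                    os.path.join(views_root, "by-song"),
                )
                for link_dir in link_dirs:
                    os.makedirs(link_dir, exist_ok=True)
                    link = os.path.join(link_dir, song)
                    # On a name clash the first file seen keeps the link.
                    if not os.path.lexists(link):
                        os.symlink(target, link)
                        made += 1
    return made
```

Keeping `views_root` outside `music_root` avoids the generated links being picked up on the next run.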
> Symlinks could help those situations, but would require a MASSIVE
> amount of work, and require predicting all the possible ways I'd want
> to filter before hand.
Whenever you build an index, you are "predicting all the possible ways [you'd] want to filter before hand."
For example, Google indexes documents by common words. They don't offer (for example) a way to search for documents with a prime number of words in them, because people don't find that useful.
> Being able to also easily tag [pictures] with info such as what
> it specifically is of, people in it, the day, etc, would be VERY nice
Have you thought about using POSIX extended attributes? You can store a lot of attributes this way and it doesn't rely on any particular framework or daemon. It also has the nice advantage that you can move files from place to place without losing your tags. (As long as the destination filesystem supports extended attributes.)
C.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 26, 2009 21:10 UTC (Mon) by nix (subscriber, #2304)
[Link]
EAs have the disadvantage that you have to write to the files in order to
index them, and that they aren't independently dated (changes to them bump
the file's mtime), so they utterly screw backup programs.
The Semantic Desktop Wants You (KDEDot)
Posted Nov 3, 2009 9:21 UTC (Tue) by cmccabe (subscriber, #60281)
[Link]
> EAs have the disadvantage that you have to write to the files in order to
> index them, and that they aren't independently dated (changes to them bump
> the file's mtime), so they utterly screw backup programs.
It just makes sense to keep the metadata of the file in the same place as the file. Think about how hard it would be to give your photo collection to a friend if all the tags were in a SQL database somewhere. "Just copy these files to exactly the right location, install this pile of software, run this database import script..." In contrast, if you store the metadata in the file itself, all of these problems go away.
Think about the ID3 format which stores artists, titles, and so on for MP3 music files. Changing the ID3 of an MP3 "screws backup programs" much more than changing an extended attribute on the file. This is because ID3v2 tags are of a variable length and usually occur at the start of the file. So basically, to change anything, you have to rewrite the whole file, in most cases. People accept this flaw, because ID3s give them something they really want-- a way to permanently label the file. With ID3, you can rename your MP3s, transfer them to a different computer, or load them in a different music player, and still see the same artist and title information.
On a side note, if you are concerned about the mtime thing, you could write a wrapper script around fsetxattr that manually sets the mtime back to what it was at the beginning of the script.
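A minimal sketch of such a wrapper, in Python rather than shell: `os.setxattr` is the standard library's binding and is only available on Linux, and the function name is made up for this example. Note that this restores atime and mtime only; ctime cannot be set back this way, so a backup tool that watches ctime would still notice the change.

```python
import os

def set_tag_keep_mtime(path, name, value):
    """Set an extended attribute, then put the file's original
    access and modification times back. Returns True if the
    attribute was actually written."""
    before = os.stat(path)
    tagged = False
    try:
        if hasattr(os, "setxattr"):  # Linux only in the standard library
            os.setxattr(path, name, value)  # value must be bytes
            tagged = True
    except OSError:
        pass  # e.g. the filesystem does not support user attributes
    finally:
        # Restore the timestamps recorded before the attribute change.
        os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
    return tagged
```

For example, `set_tag_keep_mtime("photo.jpg", "user.tags", b"holiday")` would tag the file on a filesystem that supports user attributes and leave its mtime as it was.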
Basically, I regard desktop software as ephemeral. I was running KDE a while ago. Now I run GNOME. In a few months, I'll probably run a different version of GNOME, or maybe something else entirely. I'm happy to use indexing services when they exist (I like slocate a lot), but for storing my actual data, I'll take the filesystem any day. It is better debugged, better supported, and more likely to be around in 5 years than any other kind of data store.
C.
How to use it
Posted Oct 26, 2009 10:57 UTC (Mon) by ikm (subscriber, #493)
[Link]
Could somebody please enlighten me how exactly to use this Nepomuk? I've got a KDE4 install, and on the first run it showed some Nepomuk-related messages. But that was the first and the last time I've seen or heard from it. Can I actually use it to find something? How?
How to use it
Posted Oct 26, 2009 12:32 UTC (Mon) by wstephenson (subscriber, #14795)
[Link]
Assuming it's all installed correctly (i.e. you have the sesame2 backend installed) and you're on KDE 4.3:
Open Dolphin. Make sure the Information panel is visible (press F11 if not, or find it in View->Panels).
Select a file. Notice there is a bunch of data on the file in the Information panel now, including a star rating widget.
Rate the file 5 stars (make sure it's a full 5, not 4.5).
Now in the Search... bar, enter 'rating:10' and press Enter. Nepomuk searches and shows the rated file.
Alternatively, set a tag 'kde' using the Add Tag widget, then search for it with 'hasTag:kde'.
If you just enter a bare string, it searches the file content indexed by Strigi, but indexing may not be enabled by default. You can turn it on with systemsettings, Advanced->Desktop Search->Enable Strigi Desktop File Indexer.
I've got svn (KDE 4.4 pre) installed here and the search widget has gotten a lot smarter, so it suggests the attribute you may want to search for as you type, e.g. tag, title, rating, and logical operators. There are also the beginnings of a graphical search builder where you can add search terms and operators. "Rating:2 and hasTag:demo" worked perfectly.
When I noticed that Strigi also indexes the exif tags from my pictures, I was surprised to find that I could search for Model:NIKON D90 and it worked. However, when I tried to combine that with 'and Rating>=4' the search service hung, and it was unable to handle fields containing spaces like 'ISO Exposure'. I think it shows promise though.
The Semantic Desktop Wants You (KDEDot)
Posted Oct 27, 2009 13:42 UTC (Tue) by freealter (guest, #4335)
[Link]
One question keeps coming back: how do you use these features? Mandriva has worked on Nepomuk from the very beginning. Our next release will introduce a "task oriented desktop" fully based on Nepomuk. It allows you to easily tag web pages, files and mails with your tasks, and gives you an interface (tasktop) for displaying all this information in a single place (also possible in dolphin). http://doc4.mandriva.org/bin/view/labs/Nepomuk%2Dmdv2010%... (for the details)
And, yes, it has the infrastructure and some basic functions for automatically proposing tags from the text of mails, and we also have a nice annotator that recognizes text in images. We are going to improve these semi-automatic annotation functions considerably (thanks to the Scribo research projects) in the next months.
We have chosen one way to group information (tasks), but there are others, as the Zeitgeist project shows. What is important is that today applications are isolated from one another. They share the screen, the storage, the network, and data through the clipboard. Nepomuk technologies provide a new integration medium between applications: an integration that, today, the user is doing himself.
I promise you, it is going to change the desktop a lot. And it will not be "a big sheety process eating my I/Os and my CPU"; it's going to change the way we interact with our desktops.