
********************
OBSOLETE
********************
Get GoogLyrics: http://kde-apps.org/content/show.php?content=73850
NOTICE
* DO NOT USE THIS SCRIPT RIGHT NOW *
An update will be available soon that will fix the current Scroogle issue. Sorry for any inconvenience. A renaming might be in order, so watch out for a Googlyrics
I've been annoyed by the low yield that many of the available Amarok lyrics search scripts give on some of my more obscure music. Way back in my Windows days I used a program called EvilLyrics, which had an interesting idea: Google for the lyrics, then rip them off known websites. Well, I decided this is what Amarok needed! After a quick day of scripting, here's the result: a true lyrics metasearch script.
Since Google's SOAP API is crap (it no longer lets you register keys and returns dramatically different results from the web search), I decided I'd scrape. Since Scroogle had cleaner markup, I decided it'd be even better to scrape than Google itself (and the paranoid might like me more or something). Anyway, the script searches Scroogle (hence the name Scrooglyrics) for the song lyrics, then pulls them off the known sites.
Adding sites should be easy, so tell me what sites you'd like to see and I'll be sure to get them into the next release! Remember, these should be reputable sites that show up in the first 20 results on Google for your song's lyrics.
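The flow described above (take the search results, keep the first one from a known site, then cut the lyrics out of that page) might look roughly like the sketch below. It is written in Python purely for illustration; the script itself is Perl. The site table, the regexes and all function names are hypothetical placeholders, not the real markup of those sites or the script's actual code.

```python
import re
from urllib.parse import urlparse

# Hypothetical table: host -> regex capturing the lyrics block.
# These patterns are placeholders, not the markup of the real sites.
KNOWN_SITES = {
    "azlyrics.com": re.compile(r"<!-- start -->(.*?)<!-- end -->", re.S),
    "lyrics007.com": re.compile(r'<div class="lyrics">(.*?)</div>', re.S),
}


def site_for(url):
    """Return the known-site key for a URL, or None if the host is unknown."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host if host in KNOWN_SITES else None


def first_known_result(result_urls, limit=20):
    """Return the first URL among the top `limit` results that is on a known site."""
    for url in result_urls[:limit]:
        if site_for(url):
            return url
    return None


def extract_lyrics(url, html):
    """Apply the site's regex to the page; return None if it does not match."""
    site = site_for(url)
    if site is None:
        return None
    match = KNOWN_SITES[site].search(html)
    return match.group(1).strip() if match else None
```

Adding a site in this layout means adding one entry to the table, which is presumably the kind of thing meant by "adding sites should be easy".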
To use this script you must have the WWW::Mechanize Perl module installed. Your distribution probably has a package for it, but in case it doesn't, you can do this the Perl way: su to root, then use cpan to install it, e.g.:
$ su
# cpan
cpan> install WWW::Mechanize
---- OR ----
If you're on Ubuntu or Debian, you can simply apt it with this command:
sudo apt-get install libwww-mechanize-perl
I'd like as much input on this as possible, so, if you've tested it, please comment! Your opinion is wanted :)
If you're able to, please report bugs directly to our bugtracker at:
http://udmp.info/mantis
12 years ago
0.11:
- Major rewrite
- Now easier than ever to add lyrics sites or disable them in the source (options screen coming soon)
- enhanced whitespace cleaning.
- overall cleaner code
- Now spoofs user agents to IE6 on Windows to work around any weak blocking attempts.
0.10:
- Multiple search queries to find song, starts most accurate, works down to least (this should remove errors where an incorrect page is ripped)
- Added support for themadmusicarchive.com
- Updated sing365 regex
0.9:
- Supports lyriki.com
- Supports lyricspy
- Now supports local lyrics!
To use local lyrics, make a folder called "lyrics" in your home directory and store files named either as Title.txt or as Artist - Title.txt
Note that these are case sensitive, so be careful before you say it doesn't work.
0.8:
- Several major changes, UPGRADE STRONGLY RECOMMENDED!
- Script will now fail properly if no lyrics are found
- Now fails properly when there's no connection
- Fixed several regexes which were not working properly
- Whitespace problems are now solved for several sites
0.7.4:
- No more crashing on connection problems
- Now removes parts of the title in parentheses.
- Added in README and COPYING for about dialog
0.7.3:
- Fixed bug with artist or song names containing special characters.
0.7.2:
- Fixed bug where artist name is not sent in search request
0.7.1:
- Added some extra debugging code
- Fixed lyrics007 linebreak bug
- Now removes starting "The" in artist names, seems to get better results
0.7:
- Removed dependencies on HTML::Entities and HTML::Strip, just needs Mechanize now.
0.6:
- BIG Update!
- Now searches songmeanings.net and wearethelyrics.com
- now only googles for results in sites needed
- now will keep looking if a regex fails instead of dying
- big thanks to mattepiu and neoeno for their work towards this version, great job!
- Stay tuned for a version that will include modules.
0.5.1:
- Fixed capturing for actionext.com and azlyrics.com
- Added a little extra debugging, if anyone's having problems, pull up the output window for the script and see if a regex has failed.
0.5:
- Fixed packaging error
0.4:
- Added letssingit.com and lyricwiki
- code cleanup, now passes use strict
0.3:
- Bug fix :) No more last-lyrics bug. I hope.
0.2:
- Added sing365.com support. I'll be working on that last-lyrics bug for next release.
0.1:
Initial release. Currently supports 3 lyrics websites: azlyrics.com, lyrics007.com and actionext.com. This will probably be expanded later, but I easily find most of my more obscure songs with just these 3! This, my friends, is why we need a lyrics metasearch.
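Several changelog entries above deal with how the search query is built: dropping a leading "The" from artist names (0.7.1), removing parenthesised parts of the title (0.7.4), and trying several queries from most to least accurate (0.10). A rough Python sketch of that idea follows. The function names and the exact order and shape of the fallback queries are assumptions for illustration; the changelog does not say which queries the Perl script actually sends.

```python
import re


def clean_artist(artist):
    """Drop a leading 'The ' from the artist name (see 0.7.1)."""
    return re.sub(r"^the\s+", "", artist.strip(), flags=re.I)


def clean_title(title):
    """Remove parenthesised parts of the title (see 0.7.4)."""
    return re.sub(r"\s*\([^)]*\)", "", title).strip()


def build_queries(artist, title):
    """Return queries ordered from most to least specific (assumed order)."""
    a, t = clean_artist(artist), clean_title(title)
    return [
        f'"{a}" "{t}" lyrics',  # exact artist and exact title
        f"{a} {t} lyrics",      # same words, no phrase matching
        f'"{t}" lyrics',        # title only, as a last resort
    ]
```

A caller would try each query in turn and stop at the first one that yields a result on a known site.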
ultramancool
13 years ago
PhobosK
13 years ago
Now to the problem - it worked for a while, but then it started to give an error in the WWW::Mechanize module (at the beginning it happened randomly, but now it prevents the script from working as soon as any song is played). That is why I upgraded to version 0.11, but the problem still exists. More info:
- WWW::Mechanize tested with 1.20 (the original version in Gutsy) and 1.34 (latest version from CPAN)
- Amarok 1.4.7
- Unicode system
- Nothing in the output script window
The error is:
(WWW::Mechanize 1.20)Can't call method "value" on an undefined value at /usr/share/perl5/WWW/Mechanize.pm line 1107, <STDIN> line 1.
(WWW::Mechanize 1.34) Can't call method "value" on an undefined value at /usr/local/share/perl/5.8.8/WWW/Mechanize.pm line 1247, <STDIN> line 3.
PS. BTW, requiring sign-up for the bug report system is not very convenient for users, I think, but anyway it is your decision, so I do not judge you.
ultramancool
13 years ago
You're right about the bug report system, sorry about that.
PhobosK
13 years ago
It happens practically with every song.
Here are some of them Title/Artist/Album:
01 - Dark Skies/Blutengel/Angel Dust - Ltd Edition
Adelante - Bonus/SASH/S4! Sash!
These Words/Natasha Bedingfield/Unwritten
05_ Why/DJ Sammy/The Rise
PhobosK
13 years ago
LWP::UserAgent::new: ()
LWP::UserAgent::request: ()
HTTP::Cookies::add_cookie_header: Checking www.scroogle.org for cookies
HTTP::Cookies::add_cookie_header: Checking .scroogle.org for cookies
HTTP::Cookies::add_cookie_header: Checking scroogle.org for cookies
HTTP::Cookies::add_cookie_header: Checking .org for cookies
LWP::UserAgent::send_request: GET http://www.scroogle.org/cgi-bin/scraper.htm
LWP::UserAgent::_need_proxy: Not proxied
LWP::Protocol::http::request: ()
LWP::Protocol::collect: read 18 bytes
LWP::UserAgent::request: Simple response: OK
Can't call method "value" on an undefined value at /usr/local/share/perl/5.8.8/WWW/Mechanize.pm line 1247, <STDIN> line 3.
is:
HTTP/1.1 200 OK
Date: Mon, 14 Jan 2008 22:15:56 GMT
Server: Apache/2.0.51 (Fedora)
Content-Length: 18
Connection: close
Content-Type: text/plain; charset=UTF-8
Server too busy.
So the script fails.
I ran simultaneous (sync) queries from 3 different IPs to http://www.scroogle.org/cgi-bin/scraper.htm
and the one from my IP fails every time....
+ The query CGI script: http://www.scroogle.org/cgi-bin/nbbw.cgi?Gw=point+of+no+return
gives a forbidden error....
I am speechless ...
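The trace above shows a 200 OK response whose body is the plain-text line "Server too busy." instead of the search page, and the "Can't call method value" error suggests the script then tried to fill in a form that was not there. A client can check the response before touching any form. The Python sketch below shows one way to do that; the function name and the three checks are assumptions for illustration and are not taken from the script or from WWW::Mechanize.

```python
def response_problem(status, content_type, body):
    """Return a short reason if the response cannot be a search form page, else None."""
    if status != 200:
        return f"HTTP status {status}"
    # Ignore parameters such as "; charset=UTF-8" when comparing the media type.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "text/html":
        return f"unexpected content type {media_type!r}: {body.strip()[:80]}"
    if "<form" not in body.lower():
        return "no form found in page"
    return None
```

With the values from the post, `response_problem(200, "text/plain; charset=UTF-8", "Server too busy.\r\n")` reports the content type and the server's message, so the caller can fail cleanly instead of crashing.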
ultramancool
13 years ago
DanielBrandt
13 years ago
-- Daniel Brandt, Scroogle sysop
ultramancool
13 years ago
DanielBrandt
13 years ago
-- Daniel Brandt, Scroogle sysop
qurk
13 years ago
morethanskindeep
13 years ago
1) I have noticed one thing: not all lyrics on lyricwiki.org seem to be found by scroogle.
For example, I tried the query
"Otis Taylor" "Buy Myself Some Freedom" lyrics
giving only useless results, though the site http://lyricwiki.org/Otis_Taylor:Buy_Myself_Some_Freedom does exist. (Can that be because it was added only recently?)
2) Second of all, I was wondering what you think about adding support for local lyrics files -- first searching the file names in some local directory (see also http://kde-apps.org/content/show.php/Local+Lyrics?content=37981) and in case nothing is found asking scroogle.
I don't know enough to do that myself, but I can imagine that it is not overly difficult?
Thanks again... :)
ultramancool
13 years ago
1) Yeah, that's just because the page was recently created and has yet to hit Google; not much I can do about that, sorry.
2) This seems like a pretty good idea, I'll see what I can do, maybe for 0.9 :)
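For reference, the local lookup that later shipped in 0.9 (a "lyrics" folder in the home directory with files named Title.txt or Artist - Title.txt) could be sketched in Python as below. The function name, the `base_dir` parameter and the order in which the two file names are tried are assumptions; the changelog does not state them.

```python
from pathlib import Path


def find_local_lyrics(artist, title, base_dir=None):
    """Look for 'Artist - Title.txt', then 'Title.txt', in the lyrics folder.

    Returns the file contents, or None so the caller can fall back to a web search.
    """
    base = Path(base_dir) if base_dir else Path.home() / "lyrics"
    # Assumed order: the more specific file name is tried first.
    for name in (f"{artist} - {title}.txt", f"{title}.txt"):
        candidate = base / name
        if candidate.is_file():
            return candidate.read_text(encoding="utf-8", errors="replace")
    return None
```

File names are matched exactly as given, so on a case-sensitive filesystem "why.txt" will not be found for the title "Why".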
v6lur
13 years ago
(www.lyriki.com)
ultramancool
13 years ago
stifi
13 years ago
One thing is really annoying. If some lyrics could not be fetched, the error message "Failed to find any lyrics. Press refresh to try again." is cached as lyrics in Amarok. If you use the notification "<?xml version=\"1.0\" encoding=\"UTF-8\" ?> <suggestions page_url=\"your_url\"></suggestions>" instead, Amarok will display an error in the Context Browser and will not cache any lyrics [1].
I already changed the source for my needs, but maybe changing it in the public version may help others.
[1] http://amarok.kde.org/wiki/Script-Writing_HowTo
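A small Python sketch of building that "nothing found" reply is shown below. The XML shape is copied from the comment above; the function name is hypothetical, and how the string is handed back to Amarok is outside this sketch.

```python
from xml.sax.saxutils import quoteattr


def not_found_reply(page_url):
    """Build an empty <suggestions> document for the given search URL."""
    # quoteattr adds the surrounding quotes and escapes &, < and > in the URL.
    return (
        '<?xml version="1.0" encoding="UTF-8" ?>\n'
        f"<suggestions page_url={quoteattr(page_url)}></suggestions>"
    )
```

Escaping matters here because search URLs usually contain "&", which would otherwise make the XML invalid.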
ultramancool
13 years ago
mattepiu
13 years ago
python-mechanize and stripogram
http://pastebin.ca/830909
ultramancool
13 years ago
mattepiu
13 years ago
so I recoded it better, and this time I removed the stripogram dependency
(just one dependency: python mechanize).
P.S.: I had to decode the sing365 lyrics from UTF-8, as they were encoded,
so I don't know if the Perl version already does it....
http://pastebin.com/f19836eee
ultramancool
13 years ago
mattepiu
13 years ago
1) DEPENDENCIES
Is Entities really needed?
I have issues with it, so I translated this script into Python, and I'm using
.encode('ascii', 'xmlcharrefreplace') to get that functionality.
(Well, I had to work a bit more because the Python implementation of mechanize
is different.)
Is there a way to reduce dependencies?
That would make it more likely that this replaces the basic script
(which is the only Ruby script I use, and I'd like to wipe Ruby completely).
P.S.: In both Google and Scroogle
you can restrict your search to domains:
before your query add
"site:www.azlyrics.com OR site:www.lyrics.org OR ..."
and you'll get results only from those domains...
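To illustrate the `.encode('ascii', 'xmlcharrefreplace')` trick mentioned above: it replaces every non-ASCII character with a numeric character reference, which is standard Python behaviour. The wrapper name below is made up for this example.

```python
def to_ascii_xml(text):
    """Replace non-ASCII characters with numeric character references."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_ascii_xml("Blümchen"))  # prints: Bl&#252;mchen
```

Note that this only handles non-ASCII characters; "&", "<" and ">" pass through unchanged and still need separate escaping if the output goes into XML.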
neoeno
13 years ago
My changes as I remember them:
- Adding songmeanings support
- Adding wearethelyrics support (I basically just added whatever site it took to make the script pick up my lyrics)
- I did add sing365, but you added that in the latest release anyway, so I let yours take precedence.
- Altering the search string slightly, I added 'intitle:' before the title of the song, might want to do that with the artist too. The idea was that pretty much all lyrics sites have the songTitle in the title, so it helps to weed out useless results.
A suggestion:
For a while I added the sites available to the search string (e.g. 'site:songmeanings.net OR site:sing365.com ...'), but I took it out because it was making the search query rather long.. and sometimes it would cause non-lyrics pages of the site to come up first. A bit more investigation into this might optimise the script considerably. Perhaps multiple scroogle queries would be helpful?
Anyway, here's my current script:
http://pastebin.ca/829747 (perl syntax highlighting seems to be broke'd on that site)
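The two query tweaks discussed in this thread (an 'intitle:' prefix on the song title, and an optional 'site:... OR site:...' restriction) can be combined in one small builder. The Python sketch below is only an illustration; the function name, parameters and term order are assumptions, and it is not the code from either pastebin.

```python
def build_query(artist, title, sites=(), use_intitle=True):
    """Build a search string with optional site: and intitle: operators."""
    terms = []
    if sites:
        # e.g. "site:songmeanings.net OR site:sing365.com"
        terms.append(" OR ".join(f"site:{site}" for site in sites))
    terms.append(f'"{artist}"')
    terms.append(f'intitle:"{title}"' if use_intitle else f'"{title}"')
    terms.append("lyrics")
    return " ".join(terms)
```

Leaving `sites` empty keeps the query short, which is the trade-off described above; passing a few sites narrows the results at the cost of a longer query.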
ultramancool
13 years ago
neoeno
13 years ago
Ah, yes, using inurl would probably be better.
Have you considered talking to the Amarok devs about this? Seems like with a bit of expansion it would be a shoo-in for the default script, since it's a metasearch and it's not hopelessly slow like the other one...