2009/10/2

Bring CC data to Zotero

Zotero is a Firefox extension that plays a role similar to EndNote: it helps you manage the resources you collect from the web, such as research papers and articles. Although it has a "Rights" field for copyright / license metadata, Zotero does not seem to recognize CC license data.

Zotero supports COinS, but it doesn't save "rft.rights" values from the OpenURL (I don't know why). That means you have to enter all the rights information yourself, or you might run into copyright issues when you want to reuse the data.
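
To make the "rft.rights" idea concrete, here is a minimal sketch in Python (not the extension's actual code) of reading that value out of a COinS span. It assumes the usual COinS convention of a span with class "Z3988" whose title attribute carries the OpenURL key/value string; the sample markup, the class name CoinsParser and the function rights_from_coins are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs


class CoinsParser(HTMLParser):
    """Collect the OpenURL key/value pairs of every COinS span on a page."""

    def __init__(self):
        super().__init__()
        self.context_objects = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        title = attrs.get("title")
        if tag == "span" and "Z3988" in classes and title:
            # HTMLParser has already turned &amp; back into & here;
            # parse_qs then decodes the percent-encoded values.
            self.context_objects.append(parse_qs(title))


def rights_from_coins(html_text):
    """Return every rft.rights value found in the page's COinS spans."""
    parser = CoinsParser()
    parser.feed(html_text)
    return [
        value
        for context_object in parser.context_objects
        for value in context_object.get("rft.rights", [])
    ]


# Hypothetical sample: a COinS span that stores only the license URL.
SAMPLE = (
    '<span class="Z3988" title="ctx_ver=Z39.88-2004'
    "&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc"
    "&amp;rft.title=Example+post"
    '&amp;rft.rights=http%3A%2F%2Fcreativecommons.org%2Flicenses%2Fby%2F3.0%2F">'
    "</span>"
)

print(rights_from_coins(SAMPLE))
```

The function returns a list rather than a single value because a page may carry more than one COinS span, and each span may repeat the key.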

Luckily we have MozCC, an extension that recognizes hidden metadata in web pages. We can write another extension to bring the power of MozCC to Zotero. It's actually pretty simple, but I did spend way too much time dealing with these issues:
  1. Since 2.4, MozCC has been split into two extensions: one for recognizing metadata in web pages, and another for the UI display. Because the separation is still at an early stage of development, I often feel lost when I read the code. :( Well, I'm not a good programmer; I knew that from the very beginning.
  2. COinS is totally new to me, and there are only a few examples of "rft.rights" on the Internet. This is pretty strange, but it's true: do a search and you'll see. I had to decide whether it is a good place to store license data and, if so, in which format. Since I want to tackle the multi-license issue (see below) in the next step, I store only the license URL for now.
  3. When there are two (or more) license statements on one page, how do you know which license to follow? Sometimes it means the page is dual-licensed (like Wikipedia, licensed under both GFDL and CC BY-SA), but sometimes it means different licenses apply to different parts of the same page. This could be a problem, and even MozCC doesn't solve it.
  4. Bob is too lazy.
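
The multi-license problem in item 3 is easy to reproduce. Below is a rough sketch in Python (not how MozCC does it) that assumes licenses are marked with rel="license" links, which is only one of the ways CC data can appear on a page. The sample page and the names LicenseLinkParser and find_license_urls are made up for illustration.

```python
from html.parser import HTMLParser


class LicenseLinkParser(HTMLParser):
    """Collect the href of every <a> or <link> element marked rel="license"."""

    def __init__(self):
        super().__init__()
        self.license_urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rels = (attrs.get("rel") or "").lower().split()
        href = attrs.get("href")
        if tag in ("a", "link") and "license" in rels and href:
            # Keep document order, skip exact duplicates.
            if href not in self.license_urls:
                self.license_urls.append(href)


def find_license_urls(html_text):
    """Return the distinct license URLs declared on the page, in order."""
    parser = LicenseLinkParser()
    parser.feed(html_text)
    return parser.license_urls


# Hypothetical page with two license statements.
PAGE = """
<div id="article">
  <a rel="license" href="http://creativecommons.org/licenses/by-sa/3.0/">CC BY-SA</a>
</div>
<div id="photos">
  <a rel="license" href="http://creativecommons.org/licenses/by-nc/3.0/">CC BY-NC</a>
</div>
"""

urls = find_license_urls(PAGE)
if len(urls) > 1:
    print("More than one license found, cannot tell which one applies:", urls)
```

Finding the URLs is the easy part. The markup alone does not say whether the page is dual-licensed or whether each license covers a different part of it, and that is exactly the open question.
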
Anyway, I made a simple extension (or "plugin for Zotero") to try out the idea. If you want to give it a try, or even help me improve it, you can get it from my website.

Remember that you have to install MozCC and Zotero first. You can use Zotero 1.x or the 2.x beta; for MozCC, please install the newest version IN THE SANDBOX (2.4.9+) from the Firefox Add-ons site.

Patches and feedback are welcome; just leave me a message :)

The divider for my English writing practice (oh, and you're welcome to correct my composition orz, please email me)

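Zotero is a Firefox extension for managing web documents and other material you capture. It works much like EndNote, and both lean toward academic document management. My boss would like software that can manage CC resources from the web. That would of course be great, but unfortunately, although Zotero has a "Rights" field, it doesn't recognize CC data.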

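I looked into COinS, a format Zotero supports that embeds OpenURL in web pages. The spec does include an rft.rights field, but Zotero doesn't pick it up… So I turned my attention to MozCC, an extension that can recognize CC information on web pages. In theory, a small program could let Zotero, while saving an item, carry the information MozCC recognized into the corresponding field along the way, and the job would be done.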

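The program is actually very simple… but I seem to have spent most of my time looking things up, including:
  1. Starting with MozCC 2.4.9, the extension really has been split into two parts, one for recognition and one for display. This is great and the right direction, but the split isn't very clean yet, and I often get lost reading the code orz I really am bad at this.
  2. COinS and OpenURL are new to me, and I'm honestly not sure rft.rights "may" be used to store license data. For now I just went for it, and I have seen someone using it in a way similar to mine. Also, what information should be stored? At first I thought of storing something like the license field in MP3 files, e.g. "This article is licensed under a Creative Commons Attribution license, see {URL}", but since I want to tackle issues such as multiple licenses (see the next item), for now I simply store the license URL.
  3. If two (or more) license statements appear on a page, how do you tell which license the saved portion should follow? Is it parallel licensing, or are different rights granted for different parts of the page? This is really hard, and MozCC hasn't solved it either, but I'd like to see whether there is a good way to handle it. That's for later.
  4. I'm so lazy. Watching baseball broadcasts every day is my only hobby now.
Anyway, I still made a small extension. Install Zotero and MozCC 2.4.9+ first, then install this extension. After that, when you save a web page and MozCC detects CC license information, Zotero will store the license information in the "Rights" field. If you're interested, give it a try; help with changes and discussion is very welcome.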

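I think it's simple and badly written, and I really struggled over whether to release it orz But since I don't seem to be afraid of people finding out how bad I am, here it goes XD If you'd like to help fix it or offer suggestions, please just leave a comment.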
