isi2bibtex.rb version 0.6
This is Nishimatsu.
I have improved isi2bibtex.rb, which I posted here earlier.
It now uses Austin Ziegler's Text::Format
<http://www.halostatue.ca/ruby/Text__Format.html>
to format the Abstract and other fields.
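>>> My article of June 7:
> I received my first bug report, so I have upgraded isi2bibtex.rb.
> isi2bibtex.rb is a Ruby script that converts the tagged output files of
> Web of Science, ISI's article database, to BibTeX format.
> It turned out to be very short. Please give it a try.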
love && peace && free_software
NISHIMATSU Takeshi
ーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーーー
#!/usr/bin/env ruby
=begin
= isi2bibtex.rb - convert ISI Export Format to BibTeX Format.
== What is isi2bibtex.rb?
isi2bibtex.rb converts ISI Export Format to BibTeX Format.
This is a Ruby script.
You can get the tagged Marked List in Web of Science by pushing the
[SAVE TO FILE] button.
== Copying
isi2bibtex.rb is distributed in the hope that
it will be useful, but WITHOUT ANY WARRANTY.
You can copy, modify and redistribute isi2bibtex.rb,
but only under the conditions described in
the GNU General Public License (the "GPL").
== Who is the author?
NISHIMATSU Takeshi <t-nissie{at}imr.tohoku.ac.jp>
== Why did he write it?
Because he does not like the output format of the Perl version.
== Is there a Perl version?
Yes.
You can find the Perl version by Jonathan Swinton, Ben Bolker, Anthony Stone, John J. Lee
((<"in CTAN"|URL:http://ring.tains.tohoku.ac.jp/archives/text/CTAN/biblio/bibtex/utils/isi2bibtex/>)).
== Where can I get isi2bibtex.rb?
Please download isi2bibtex.rb from:
((<URL:http://www-lab.imr.tohoku.ac.jp/%7Et-nissie/computer/software/isi2bibtex/isi2bibtex.rb>))
== How can I use it?
(1) Mark the articles in ISI Web of Science.
(2) View and save the marked records to an output file (savedrecs.txt).
    I recommend checking "Author(s)", "Title", "Source", "abstract*",
    "abstract", "keywords" and "source abbreviation" as the fields to
    include in the output file.
(3) Then, here are some examples:
      % ruby isi2bibtex.rb savedrecs.txt
      % ruby isi2bibtex.rb savedrecs1.txt savedrecs2.txt > savedrecs.bib
      % ruby isi2bibtex.rb < savedrecs.txt > savedrecs.bib
      % cat savedrecs.txt | ./isi2bibtex.rb > savedrecs.bib
== I do not like the output format of isi2bibtex.rb, either!
The output format is defined in the source code in a WYSIWYG manner,
so you can easily change it yourself.
== ChangeLog
=== 2004-06-18
* ISI_record#fmt()
* version 0.6 is released!
=== 2004-06-17
* require ((<"Text::Format"|URL:http://www.halostatue.ca/ruby/Text__Format.html>))
by Austin Ziegler.
* Title, Keywords, NewKeywords, and Abstract are nicely formatted into fixed-width.
* version 0.5 is released!
=== 2004-06-09
* simplified.
* tags are sorted.
* Reports "Filename:LineNumber: ..." when unknown tags are found.
* version 0.4
=== 2004-06-07
* Format of ref_name is changed to author:[authors:]journal:volume:page:year.
* Author names such as "de Haas, WJ" and "van Alphen, PM" are now handled.
* AR tag (article number of new APS journals) is now supported.
* The case of "BP art. no., EP 125111" is now handled.
* version 0.3 is released!
=== 2002-06-28
* version 0.2 is released!
== Meanings of tags in ISI Export Format:
See ((<URL:http://isibasic.com/help/helpprn.html>))
=== file-unique tags
FN: File type. The file starts with 'FN ISI Export Format'
VR: Version number of ISI export file format
EF: End of file
=== normal tags
AB: Abstract
AR: Article number of new APS journals
AU: Authors
BP: Beginning page
C1: Research addresses
CR: Cited references
DE: Original keywords
DT: Document type
EP: Ending page
ER: End of a record
GA: ISI document delivery number
ID: New keywords given by ISI
IS: Issue
J9: 29-character journal title abbreviation
JI: ISO journal title abbreviation
LA: Language
NR: Cited reference count
PA: Publisher address
PD: Publication date e.g. "JUN 8" or "JUL"
PG: Number of pages
PI: Publisher city
PN: Part number
PT: Publication type (e.g., book, journal, book in series)
PU: Publisher
PY: Publication year
RP: Reprint address
SE: Book series title
SI: Special issue
SN: ISSN
SO: Journal title, in full
SU: Supplement
TC: Times cited
TI: Title
UT: ISI unique article identifier
VL: Volume
WP: Publisher web address
== Known bugs
* none.
== TODO
* Write papers, not tools for writing papers.
== References
* ((<The BibTeX Format|URL:http://www.ecst.csuchico.edu/~jacobsd/bib/formats/bibtex.html>))
* ((<Bibliography (BibTeX) Tools|URL:http://www.ecst.csuchico.edu/~jacobsd/bib/tools/bibtex.html>))
== Thanks to contributor(s)!
* Marcin Dulak
=end
ISI2BIBTEX_RB_VERSION = "0.6"
require 'text/format'
class ISI_record
  public

  def initialize(hash)
    @hash = hash
  end

  # Values are aligned so that field text starts after 22 characters,
  # matching :left_margin => 22 in FMT.
  def to_bibtex
    "@ARTICLE{#{ref_name},
  Author     =       {#{and_separated_authors}},
  Title      =       {#{fmt('TI')}},
  Journal    =       {#{@hash['JI']}},
  JournalFull=       {#{@hash['SO']}},
  Year       =       {#{@hash['PY']}},
  Month      =       {#{month}},
  Volume     =       {#{@hash['VL']}},
  Number     =       {#{@hash['IS']}},
  Pages      =       {#{pages}},
  Keywords   =       {#{fmt('DE')}},
  NewKeywords=       {#{fmt('ID')}},
  Abstract   =       {#{fmt('AB')}},
  URL        =       {},
  MyComment  =       {},
  WhereIFiledIt=     {}}\n\n"
  end
  private

  FMT = Text::Format.new(:columns => 80, :first_indent => 1, :left_margin => 22)

  # Wrap a field to fixed width, then discard the indent of the first line,
  # because the first line continues right after the "{" of its field.
  def fmt(tag)
    FMT.paragraphs(@hash[tag]).sub(/\A +/, '')
  end
  def pages
    if @hash['AR']
      return @hash['AR']                       # article number of new APS journals
    elsif @hash['BP'] =~ /^art/
      return @hash['EP']                       # in the case of "BP art. no., EP 125111"
    elsif @hash['EP'] =~ /^\w?\d+/
      return @hash['BP'] + '-' + @hash['EP']   # in the cases of "EP 1234" or "EP L567"
    else
      return @hash['BP']
    end
  end

  # Three-letter month name in upper case, or '' when PD has no month.
  def month
    if @hash['PD'] =~ /^(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)/i
      return $1.upcase
    else
      return ''
    end
  end
  def and_separated_authors
    au = ''
    @hash['AU'].each_with_index do |name,i|
      au += ' and ' if i>0
      if name =~ /, /
        family_name = $`
        initials    = $'
        au += initials.scan(/\w/).join(". ") + '. '   # "ABC" -> "A. B. C. "
        au += family_name
      else
        au += name
      end
    end
    return au
  end
  def ref_name
    rn = ''
    @hash['AU'].each_with_index do |name,i|
      if i==0
        if name =~ /, /
          rn << $` << ':'          # Take the component before /, /
        else
          rn << name << ':'        # Take the whole name
        end
      else
        rn << name[0,1] << ':'     # Take the first character of the name
      end
    end
    # pages may be nil when the record has no page information, hence to_s.
    rn << @hash['JI'].to_s << ':' << @hash['VL'].to_s << ':p' << pages.to_s << ':' << @hash['PY'].to_s
    return rn.gsub(/\. */,'').gsub(/ +/,'').sub(/PhysRev(A|B|C|D|E)/,'PR\1').sub(/PhysRevLett/,'PRL')
  end
end
class Object
  def read_an_ISI_record   # for ARGF
    hash = {}
    while line = gets
      break unless line   # line is nil only after a redo that hit end of input
      #===== a few special cases
      return ISI_record.new(hash) if line =~ /^ER/
      next if line =~ /^(EF|FN|VR)/   # ignore file-unique tags
      next if line =~ /^\s*$/         # ignore blank lines
      while line =~ /\!$/
        line = line.chomp.chop << gets.to_s   # continued to next line if the line ends with "!"
      end
      #===== Normal tags
      case line
      when /^AU /
        authors = [line.chomp.sub(/^AU /,'')]
        while (line = gets) =~ /^ /
          authors.push(line.strip)        # continuation lines are indented
        end
        hash['AU'] = authors
        redo                              # process the line that ended the AU block
      when /^(TC|PG|PY) (.*)$/
        hash[$1] = $2.to_i
      when /^(TI|AB|DE|ID|SO|PT|JI|BP|EP|AR|PD|VL|IS|GA|PI|PU|PN|PA|J9|UT|DT|C1|RP|SI|SE|SU) (.*)$/
        tag = $1
        str = $2
        while (line = gets) =~ /^ /
          str << ' ' << line.strip        # join continuation lines with one space
        end
        hash[tag] = str
        redo                              # process the line that ended the field
      else
        STDERR << "#{$FILENAME}:#{file.lineno}: Unknown tag: #{line}"
      end
    end
    return nil
  end
end
# MAIN LOOP
while isi = ARGF.read_an_ISI_record
  print isi.to_bibtex
end
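
The author-name handling mentioned in the ChangeLog ("de Haas, WJ" and
"van Alphen, PM") can be tried on its own. What follows is a minimal
standalone sketch, not the script's own code: the method names
bibtex_author and bibtex_authors are hypothetical and exist only for this
illustration. It assumes that names arrive as "Family, INITIALS", as in the
AU field.

```ruby
# Sketch only: bibtex_author and bibtex_authors are hypothetical names
# that do not appear in isi2bibtex.rb.

# "de Haas, WJ" -> "W. J. de Haas"; a name without ", " is passed through.
def bibtex_author(name)
  family_name, initials = name.split(', ', 2)
  return name if initials.nil?
  initials.scan(/\w/).join('. ') + '. ' + family_name
end

# Join the converted names with " and ", the separator BibTeX expects.
def bibtex_authors(names)
  names.map { |n| bibtex_author(n) }.join(' and ')
end

puts bibtex_authors(['de Haas, WJ', 'van Alphen, PM'])
# prints: W. J. de Haas and P. M. van Alphen
```

Splitting at the first ", " keeps multi-word family names such as "de Haas"
intact, which is the case the 2004-06-07 ChangeLog entry refers to.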