isi2bibtex.rb version 0.6

From: NISHIMATSU Takeshi <t-nissie@No-spam.imr.tohoku.ac.jp>
Newsgroups: fj.sources,fj.comp.lang.ruby,fj.comp.texhax
Followup-To: fj.sources.d
Subject: isi2bibtex.rb version 0.6
Date: 23 Jun 2004 10:53:04 +0900
Organization: Tohoku Univ InterNetNews Site
Message-ID: <yeyvfhjhwfz.fsf@cms26.imr.tohoku.ac.jp>

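This is Nishimatsu.

I have improved isi2bibtex.rb, which I posted here earlier.
It now uses Austin Ziegler's Text::Format
<http://www.halostatue.ca/ruby/Text__Format.html>
to format the Abstract and similar fields.

>>> My article of June 7:
> I received my first bug report, so I have released a new version of
> isi2bibtex.rb. isi2bibtex.rb is a Ruby script that converts the tagged
> output files of Web of Science, ISI's literature database, into BibTeX
> format. It turned out to be very short. Please give it a try.

   love && peace && free_software
   NISHIMATSU Takeshi

----------------------------------------------------------------------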
#!/usr/bin/env ruby
=begin
= isi2bibtex.rb - convert ISI Export Format to BibTeX Format.
== What is isi2bibtex.rb?
isi2bibtex.rb converts ISI Export Format to BibTeX Format.
This is a Ruby script.

You can get the tagged Marked List in Web of Science by pushing the
[SAVE TO FILE] button.

== Copying
isi2bibtex.rb is distributed in the hope that
it will be useful, but WITHOUT ANY WARRANTY.
You can copy, modify and redistribute isi2bibtex.rb,
but only under the conditions described in
the GNU General Public License (the "GPL").

== Who is the author?
NISHIMATSU Takeshi <t-nissie{at}imr.tohoku.ac.jp>

== Why did he write it?
Because he does not like the output format of the Perl version.

== Is there a Perl version?
Yes.
You can find the Perl version by Jonathan Swinton, Ben Bolker, Anthony Stone, and John J. Lee
((<"in CTAN"|URL:http://ring.tains.tohoku.ac.jp/archives/text/CTAN/biblio/bibtex/utils/isi2bibtex/>)).

== Where can I get isi2bibtex.rb?
Please download isi2bibtex.rb from:
((<URL:http://www-lab.imr.tohoku.ac.jp/%7Et-nissie/computer/software/isi2bibtex/isi2bibtex.rb>))

== How can I use it?
(1) Mark the articles in ISI Web of Science.
(2) View and save the marked records to an output file (savedrecs.txt).
    I recommend checking "Author(s)", "Title", "Source", "abstract*",
    "abstract", "keywords" and "source abbreviation" as the fields to
    include in the output file.
(3) Run the script on the saved file. Some examples:
 % ruby isi2bibtex.rb savedrecs.txt
 % ruby isi2bibtex.rb savedrecs1.txt savedrecs2.txt > savedrecs.bib
 % ruby isi2bibtex.rb < savedrecs.txt > savedrecs.bib
 % cat savedrecs.txt | ./isi2bibtex.rb > savedrecs.bib

== I do not like the output format of isi2bibtex.rb, either!
The output format is written out in the source code exactly as it appears
in the output (WYSIWYG style), so you can easily change it yourself.

== ChangeLog
=== 2004-06-18
* ISI_record#fmt()
* version 0.6 is released!

=== 2004-06-17
* require ((<"Text::Format"|URL:http://www.halostatue.ca/ruby/Text__Format.html>))
  by Austin Ziegler.
* Title, Keywords, NewKeywords, and Abstract are nicely formatted into fixed-width text.
* version 0.5 is released!

=== 2004-06-09
* simplified.
* tags are sorted.
* Reports "Filename:LineNumber: ..." when unknown tags are found.
* version 0.4

=== 2004-06-07
* Format of ref_name is changed to author:[authors:]journal:volume:page:year.
* Author names such as "de Haas, WJ" and "van Alphen, PM" are now handled.
* The AR tag (article number of new APS journals) is now handled.
* Records of the form "BP art. no., EP 125111" are now handled correctly.
* version 0.3 is released!
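
The reference-name format and the author-name handling listed for this
release can be illustrated with a small standalone sketch. The helper names
bibtex_author and citation_key are hypothetical and chosen for this example
only; they are not the methods defined in the script, and the volume, page
and year values are made up.

```ruby
# Illustrative sketch only: hypothetical helpers, not the script's own methods.

# Turn an ISI-style name such as "de Haas, WJ" into "W. J. de Haas".
def bibtex_author(name)
  family, initials = name.split(', ', 2)
  return name if initials.nil?            # no comma: keep the name as-is
  initials.scan(/\w/).join('. ') + '. ' + family
end

# Build a key of the form author:[authors:]journal:volume:page:year.
# The first author contributes the family name, later authors one character.
def citation_key(authors, journal, volume, page, year)
  first = authors.first.split(', ', 2).first
  rest  = authors.drop(1).map { |a| a[0, 1] }
  key   = ([first] + rest + [journal, volume, "p#{page}", year]).join(':')
  key.gsub(/\. */, '').gsub(/ +/, '')     # drop periods and spaces
     .sub(/PhysRev([A-E])/, 'PR\1')       # shorten Physical Review titles
end

authors = ['de Haas, WJ', 'van Alphen, PM']
puts authors.map { |a| bibtex_author(a) }.join(' and ')
puts citation_key(authors, 'Phys. Rev. B', 65, 125111, 2002)
```

Keeping the whole family name of the first author and only one character of
each co-author keeps keys short while still telling apart papers by the same
first author.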

=== 2002-06-28
* version 0.2 is released!

== Meanings of tags in ISI Export Format:
See ((<URL:http://isibasic.com/help/helpprn.html>))
=== file-unique tags
 FN: File type. The file starts with 'FN ISI Export Format'
 VR: Version number of ISI export file format
 EF: End of file
=== normal tags
 AB: Abstract
 AR: Article number of new APS journals
 AU: Authors
 BP: Beginning page
 C1: Research addresses
 CR: Cited references
 DE: Original keywords
 DT: Document type
 EP: Ending page
 ER: End of record
 GA: ISI document delivery number
 ID: New keywords given by ISI
 IS: Issue
 J9: 29-character journal title abbreviation
 JI: ISO journal title abbreviation
 LA: Language
 NR: Cited reference count
 PD: Publication date e.g. "JUN 8" or "JUL"
 PG: Number of pages
 PI: Publisher city
 PN: Part number
 PT: Publication type (e.g., book, journal, book in series)
 PU: Publisher
 PY: Publication year
 RP: Reprint address
 SE: Book series title
 SI: Special issue
 SN: ISSN
 SO: Journal title, in full
 SU: Supplement
 TC: Times cited
 TI: Title
 UT: ISI unique article identifier
 VL: Volume
 WP: Publisher web address
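
To make the tag layout concrete, here is a minimal standalone sketch of
reading one tagged record into a Hash. It is a simplified illustration and
not the parser used in the script: the function name parse_record and the
sample record are invented for this example, and it assumes that a
continuation line starts with three spaces, as the script itself does.

```ruby
require 'stringio'

# Hypothetical sample record, invented for illustration only.
sample = <<~EOS
  FN ISI Export Format
  VR 1.0
  PT J
  AU de Haas, WJ
     van Alphen, PM
  TI An example title that is long enough
     to continue on a second line
  PY 2002
  ER

  EF
EOS

# Read one record (up to the ER tag) into a Hash of tag => value.
# AU becomes an Array of names; other tags become a single String.
def parse_record(io)
  record = {}
  tag = nil
  io.each_line do |line|
    line = line.chomp
    case line
    when /\AER/ then return record
    when /\A(?:FN|VR|EF)\b/, /\A\s*\z/ then next   # file-unique tags, blank lines
    when /\A([A-Z][A-Z0-9]) (.*)\z/                # a new two-character tag
      tag = $1
      record[tag] = tag == 'AU' ? [$2] : $2.dup
    when /\A   (.*)\z/                             # continuation of the last tag
      next if tag.nil?
      if tag == 'AU'
        record[tag] << $1
      else
        record[tag] << ' ' << $1
      end
    end
  end
  nil   # no complete record was found
end

rec = parse_record(StringIO.new(sample))
p rec['AU']
p rec['TI']
p rec['PY'].to_i
```

Unlike the script, this sketch accepts any two-character tag instead of
warning about unknown ones, which keeps the example short.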

== Known bugs
* none.

== TODO
* Write papers, not tools for writing papers.

== References
* ((<The BibTeX Format|URL:http://www.ecst.csuchico.edu/~jacobsd/bib/formats/bibtex.html>))
* ((<Bibliography (BibTeX) Tools|URL:http://www.ecst.csuchico.edu/~jacobsd/bib/tools/bibtex.html>))

== Thanks to contributor(s)!
* Marcin Dulak
=end
ISI2BIBTEX_RB_VERSION = "0.6"
require 'text/format'
class ISI_record
public
  def initialize(hash)
    @hash = hash
  end

  def to_bibtex
    return nil if self==nil
    "@ARTICLE{#{ref_name},
        Author     = {#{and_separated_authors}},
        Title      = {#{fmt('TI')}},
        Journal    = {#{@hash['JI']}},
        JournalFull= {#{@hash['SO']}},
        Year       = {#{@hash['PY']}},
        Month      = {#{month}},
        Volume     = {#{@hash['VL']}},
        Number     = {#{@hash['IS']}},
        Pages      = {#{pages}},
        Keywords   = {#{fmt('DE')}},
        NewKeywords= {#{fmt('ID')}},
        Abstract   = {#{fmt('AB')}},
        URL        = {},
        MyComment  = {},   
        WhereIFiledIt= {}}\n\n"
  end

private
  FMT = Text::Format.new(:columns => 80, :first_indent => 1, :left_margin => 22)
  def fmt(tag)
    FMT.paragraphs(@hash[tag]).sub(/                       /,'')
  end            # This String#sub discards the indent of the first line.

  def pages
    if @hash['AR']
      return @hash['AR']   # article number of new APS journals
    elsif @hash['BP'] =~ /^art/
      return @hash['EP']   # in the case of "BP art. no., EP 125111"
    elsif @hash['EP'] =~ /^\w?\d+/
      return @hash['BP'] + '-' + @hash['EP']   # in the cases of "EP 1234" or "EP L567"
    else
      return @hash['BP']
    end
  end

  def month
    if @hash['PD'] =~ /^(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)/i
      return $1.upcase
    else
      return ''
    end
  end

  def and_separated_authors
    au = ''
    @hash['AU'].each_with_index do |name,i|
      au += ' and ' if i>0
      if name =~ /, /
        family_name = $`
        initials = $'
        au += initials.scan(/\w/).join(". ") + '. '  # "ABC" -> "A. B. C. "
        au += family_name
      else
        au += name
      end
    end
    return au
  end

  def ref_name
    rn = ''
    @hash['AU'].each_with_index do |name,i| 
      if i==0
        if name =~ /, /
          rn << $` << ':'       # Take the component before /, /
        else
          rn << name << ':'     # Take the whole name
        end
      else
        rn << name[0,1] << ':'  # Take the first character of the name
      end
    end
    rn << @hash['JI'].to_s << ':'<< @hash['VL'].to_s << ':p'<< pages << ':'<< @hash['PY'].to_s
    return rn.gsub(/\. */,'').gsub(/ +/,'').sub(/PhysRev(A|B|C|D|E)/,'PR\1').sub(/PhysRevLett/,'PRL')
  end
end

class Object
  def read_an_ISI_record   # for ARGF
    hash = {}
    while line = gets
      #===== a few special cases
      return ISI_record.new(hash) if line =~ /^ER/
      next if line =~ /^(EF|FN|VR)/   # ignore file-unique tags
      next if line =~ /^\s*$/         # ignore blank lines
      while line =~ /\!$/
        line.chomp!.chop!   # continued to next line if the line ends with "!"
        line << gets
      end
      #===== Normal tags
      case line
      when /^AU /
        authors = [line.chomp.sub(/^AU /,'')]
        while (line = gets) =~ /^   /
          authors.push(line.chomp.sub(/^   /,''))
        end
        hash['AU'] = authors
        redo
      when /^(TC|PG|PY) (.*)$/
        hash[$1] = $2.to_i
      when /^(TI|AB|DE|ID|SO|PT|JI|BP|EP|AR|PD|VL|IS|GA|PI|PU|PN|PA|J9|UT|DT|C1|RP|SI|SE|SU) (.*)$/
        tag = $1
        str = $2
        while (line = gets) =~ /^   /
          # Strip two of the three leading spaces; the third one joins the lines.
          str << line.chomp.sub(/^  /,'')
        end
        hash[tag] = str
        redo
      else
        STDERR << "#{$FILENAME}:#{file.lineno}: Unknown tag: #{line}"
      end
    end
    return nil
  end
end

# MAIN LOOP
while isi = ARGF.read_an_ISI_record
  print isi.to_bibtex
end
