Using textutil on the mac terminal

 

Apps-utilities-terminal-icon

The following was originally found on https://developer.apple.com

https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man1/textutil.1.html

 

NAME
textutil — text utility

SYNOPSIS
textutil [command_option] [other_options] file …

DESCRIPTION
textutil can be used to manipulate text files of various formats, using the mechanisms provided by the Cocoa text system.

The first argument indicates the operation to perform, one of:

-help Show the usage information for the command and exit. This is the default command option if none is specified.

-info Display information about the specified files.

-convert fmt Convert the specified files to the indicated format and write each one back to the file system.

-cat fmt Read the specified files, concatenate them, and write the result out as a single file in
the indicated format.

fmt is one of: txt, html, rtf, rtfd, doc, docx, wordml, odt, or webarchive

There are some additional options for general use:

-extension ext Specify an extension to be used for output files (by default, the extension will be
determined from the format).

-output path Specify the file name to be used for the first output file.

-stdin Specify that input should be read from stdin rather than from files.

-stdout Specify that the first output file should go to stdout.

-encoding IANA_name | NSStringEncoding
Specify the encoding to be used for plain text or HTML output files (by default, the output encoding will be UTF-8). NSStringEncoding refers to one of the numeric values recognized by NSString. IANA_name refers to an IANA character set name as understood by CFString. The operation will fail if the file cannot be converted to the specified encoding.

-inputencoding IANA_name | NSStringEncoding
Force all plain text input files to be interpreted using the specified encoding (by default, a file’s encoding will be determined from its BOM). The operation will fail if the file cannot be interpreted using the specified encoding.

-format fmt Force all input files to be interpreted using the indicated format (by default, a
file’s format will be determined from its contents).

-font font Specify the name of the font to be used for converting plain to rich text.

-fontsize size Specify the size in points of the font to be used for converting plain to rich text.

— Specify that all further arguments are file names.

There are some additional options for HTML and WebArchive files:

-noload Do not load subsidiary resources.

-nostore Do not write out subsidiary resources.

-baseurl url Specify a base URL to be used for relative URLs.

-timeout t Specify the time in seconds to wait for resources to load.

-textsizemultiplier x
Specify a numeric factor by which to multiply font sizes.

-excludedelements (tag1, tag2, …)
Specify which HTML elements should not be used in generated HTML (the list should be a
single argument, and so will usually need to be quoted in a shell context).

-prefixspaces n Specify the number of spaces by which to indent nested elements in generated HTML
(default is 2).

There are some additional options for treating metadata:

-strip Do not copy metadata from input files to output files.

-title val Specify the title metadata attribute for output files.

-author val Specify the author metadata attribute for output files.

-subject val Specify the subject metadata attribute for output files.

-keywords (val1, val2, …)
Specify the keywords metadata attribute for output files (the list should be a single
argument, and so will usually need to be quoted in a shell context).

-comment val Specify the comment metadata attribute for output files.

-editor val Specify the editor metadata attribute for output files.

-company val Specify the company metadata attribute for output files.

-creationtime yyyy-mm-ddThh:mm:ssZ
Specify the creation time metadata attribute for output files.

-modificationtime yyyy-mm-ddThh:mm:ssZ
Specify the modification time metadata attribute for output files.

EXAMPLES
textutil -info foo.rtf

displays information about foo.rtf.

textutil -convert html foo.rtf

converts foo.rtf into foo.html.

textutil -convert rtf -font Times -fontsize 10 foo.txt

converts foo.txt into foo.rtf, using Times 10 for the font.

textutil -cat html -title “Several Files” -output index.html *.rtf

loads all RTF files in the current directory, concatenates their contents, and writes the result out as
index.html with the HTML title set to “Several Files”.

DIAGNOSTICS
The textutil command exits 0 on success, and 1 on failure.

Collocate 1.0 for Windows free download

 

Graph-Magnifier-icon

Through the Corpora list

:::::::::::::::::::::::::::::::::::::

The Windows program Collocate 1.0 is available for download at michaelbarlow.com. The software extracts collocations in various ways (including n-grams) and provides scores based on MI, t-score and Log Likelihood. This version of the software is now free.

If you want to simply check out the functionality of the software, or have misplaced your manual, you can get it from

https://auckland.academia.edu/MichaelBarlow

 

 

#corpusMOOC Corpus Linguistics: Method, Analysis, Interpretation starts Sept 29

futurelearnlogo

This free MOOC Offers practical introduction to the methodology of corpus linguistics for researchers in social sciences and humanities. It is an 8-week course and is run by Lancaster University.

More information here.