Parameter expansion
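This is a common way to process string and numeric data in a script. It can replace external programs such as sed and cut, which speeds things up significantly because no extra process is started. As our experience with scripting grows, the ability to effectively manipulate strings and numbers will prove extremely valuable.
It is very helpful for checking variable values (for example, whether an argument starts with a dash and should otherwise be treated as a plain argument) and for extraction (for example, stripping the suffix from a file name). The manual, Shell parameter expansion, covers every case but can be hard to follow; trying the forms by hand makes them clear. A pipeline can achieve the same result, but it is more cumbersome.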
Here is a decent summary in Chinese: Shell扩展(Shell Expansions)-参数扩展(Shell Parameter Expansion). It contains one mistake: $$ is what holds the PID of the current shell.
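Note the difference between a null and an unset variable. set -u makes the shell report an error when an undefined (that is, unset) variable is used. Null here simply means empty: after var=, the variable exists but has no value: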
1 | var= |
Bash’s various forms of parameter expansion can also distinguish between unset and null values:
1 | # `w` can be literal or another variable |
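As a minimal sketch of how these forms treat the three states of a variable (unset, null, set), the following uses the literal default `w`; the result variable names are only illustrative.

```shell
#!/usr/bin/env bash
# State 1: unset
unset var
unset_colon="${var:-w}"    # unset -> w
unset_plain="${var-w}"     # unset -> w

# State 2: null (the variable exists but is empty)
var=
null_colon="${var:-w}"     # null -> w, because ":" also treats null as missing
null_plain="${var-w}"      # null -> empty, because the variable is set

# State 3: set to a non-empty value
var=value
set_colon="${var:-w}"      # -> value
set_alt="${var:+w}"        # set and non-null -> w

# ${var+yes} expands to "yes" only when var is set (even if null),
# so it can test for existence without tripping set -u.
unset var
is_set="${var+yes}"        # -> empty

printf '%s\n' "$unset_colon" "$unset_plain" "$null_colon" "[$null_plain]" "$set_colon" "$set_alt" "[$is_set]"
```

The forms without the colon only care whether the variable is set; the forms with the colon treat an empty value the same as a missing one.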
Expansions that return variable names:
1 | # return variable name starts with prefix |
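A small sketch of the name-returning expansions. The `demo_cfg_` variables and the `arr` array are hypothetical and exist only for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical variables that share the prefix "demo_cfg_".
demo_cfg_host=localhost
demo_cfg_port=8080

# "${!prefix@}" expands to the NAMES of all variables starting with prefix,
# one word per name.
names=("${!demo_cfg_@}")
printf '%s\n' "${names[@]}"

# "${!array[@]}" expands to the indices (keys) of an array.
arr=(a b c)
indices=("${!arr[@]}")
printf '%s\n' "${indices[*]}"
```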
Indirect parameter expansion:
1 | parameter="var" |
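To make indirect expansion concrete, here is a short sketch; the value `hello` and the name `result` are made up for the example.

```shell
#!/usr/bin/env bash
parameter="var"
var="hello"

# ${!parameter} first expands parameter to "var",
# then expands the variable named by that result.
result="${!parameter}"
echo "$result"
```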
Substring expansion, with the example from the Shell parameter expansion manual linked above:
1 | $ string=01234567890abcdefgh |
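The offset and length forms can be tried like this. It is a sketch using the same string as the manual; note that a negative offset needs a space (or parentheses) after the colon so it is not read as the `:-` default form, and negative lengths assume a reasonably recent Bash (4.2 or later).

```shell
#!/usr/bin/env bash
string=01234567890abcdefgh

from7="${string:7}"          # from offset 7 to the end
two="${string:7:2}"          # two characters starting at offset 7
drop2="${string:7:-2}"       # from offset 7, stopping 2 before the end
last7="${string: -7}"        # the last 7 characters
last7_two="${string: -7:2}"  # two characters, starting 7 from the end

printf '%s\n' "$from7" "$two" "$drop2" "$last7" "$last7_two"
```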
Other common uses, mainly string operations and especially pathnames; this is much faster than extraction with cut!!
1 | # get string length |
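A sketch of the usual pathname idioms, assuming a hypothetical path `/usr/local/lib/archive.tar.gz`. `#` and `%` remove the shortest match from the front and back; `##` and `%%` remove the longest.

```shell
#!/usr/bin/env bash
path=/usr/local/lib/archive.tar.gz   # hypothetical path for illustration

len=${#path}          # string length
base=${path##*/}      # strip longest prefix ending in "/"  -> file name
dir=${path%/*}        # strip shortest suffix from last "/" -> directory
ext=${path##*.}       # strip longest prefix ending in "."  -> last extension
noext=${base%.*}      # strip shortest ".*" suffix          -> drop one extension
stem=${base%%.*}      # strip longest ".*" suffix           -> drop all extensions

printf '%s\n' "$len" "$base" "$dir" "$ext" "$noext" "$stem"
```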
This can also be used to check whether a string contains a substring:
1 | # empty the whole string if substring target is inside |
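One way to wrap this trick in a helper is sketched below. The function name `contains` is hypothetical, and the guard for an empty haystack is my own addition: an empty string would otherwise look like a match.

```shell
#!/usr/bin/env bash
# contains HAYSTACK NEEDLE: succeed if NEEDLE occurs inside HAYSTACK.
contains() {
  local haystack=$1 needle=$2
  # If needle occurs anywhere, the longest prefix matching *needle* is the
  # whole string, so the expansion becomes empty. Quoting "$needle" makes
  # it match literally instead of as a glob pattern.
  [[ -n $haystack && -z ${haystack##*"$needle"*} ]]
}

if contains "hello world" "lo w"; then
  echo "found"
fi
```

In everyday scripts `[[ $haystack == *"$needle"* ]]` does the same job and reads more directly.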
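In fact, # and ## accept full pattern matching after them, which makes them more powerful (the +(-) extended pattern below needs shopt -s extglob). For example: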
1 | ${1#+(-)} |
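A sketch of what this does to an option-like argument; `arg` stands in for `$1` here. Because `#` removes the shortest match, `+(-)` strips a single dash, while `##` strips all leading dashes.

```shell
#!/usr/bin/env bash
shopt -s extglob          # +(pattern) is an extended glob operator

arg=--verbose             # stand-in for $1

one=${arg#+(-)}           # shortest match of "one or more dashes": one dash
all=${arg##+(-)}          # longest match: every leading dash

printf '%s\n' "$one" "$all"
```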
Search and replace: ${parameter/pattern/string} replaces the first match of pattern in parameter with string.
1 | foo=JPG.JPG |
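The variants of the replace form can be compared side by side. This is a sketch using the same `foo` value; the result names are illustrative.

```shell
#!/usr/bin/env bash
foo=JPG.JPG

first=${foo/JPG/jpg}      # replace the first match only
all=${foo//JPG/jpg}       # "//" replaces every match
front=${foo/#JPG/jpg}     # "/#" matches only at the start of the value
back=${foo/%JPG/jpg}      # "/%" matches only at the end of the value

printf '%s\n' "$first" "$all" "$front" "$back"
```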
Shell Globs
A glob is a wildcard that is processed by the shell and expands into a list of arguments.
A glob is like a regular expression but less expressive and easier to use. Globs match file names, for example ls [0-9]?file*.txt, whereas regular expressions match text, for example ls | grep '[0-9].file.*\.txt'. The line between them can look blurred depending on how you use them. Both can appear in conditions: globs in [[ == ]] and case, regular expressions in [[ =~ ]].
In ls [0-9]?file*.txt, ls itself does not support regular expressions or globs; the shell expands the glob and passes the resulting file names to ls.
In grep '^A.*\.txt' *.txt, grep applies the regular expression to the contents of the files, whose names the shell expanded from the glob *.txt.
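To see the two matching styles next to each other in conditions, here is a sketch with a made-up file name. The glob goes on the right of `==` (or in a `case` pattern), the regular expression on the right of `=~`.

```shell
#!/usr/bin/env bash
name=1afile_notes.txt      # hypothetical file name
glob_match=no regex_match=no case_match=no

# Glob: "?" is exactly one character, "*" is any run of characters.
[[ $name == [0-9]?file*.txt ]] && glob_match=yes

# Regex: "." is one character, ".*" any run; anchors are needed to
# match the whole string the way a glob does.
re='^[0-9].file.*\.txt$'
[[ $name =~ $re ]] && regex_match=yes

# case only understands globs, not regular expressions.
case $name in
  [0-9]?file*.txt) case_match=yes ;;
esac

echo "$glob_match $regex_match $case_match"
```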
Shell expansion types, in the order Bash performs them (top to bottom). Tilde, parameter, arithmetic expansion and command substitution happen in the same left-to-right pass:
- brace expansion
touch file{1..2}
- tilde expansion
ls ~
- parameter and variable expansion
${1:1:1}, ${PATH}
- arithmetic expansion
echo $((11 + 22))
- command substitution
$() or ``
- word splitting
- filename expansion
echo file{1..2}.*
- quote removal
echo "$USER"
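The order matters in practice. The sketch below shows two consequences: a brace range cannot take its bounds from a variable, because brace expansion has already run by the time `$n` is substituted, and word splitting applies only to unquoted expansion results. The variable names are illustrative.

```shell
#!/usr/bin/env bash
n=3
# Brace expansion runs first, when $n is still unexpanded text, so {1..$n}
# is not a valid sequence and is left alone; $n is substituted afterwards.
literal=$(echo {1..$n})

# Word splitting happens after parameter expansion, and only when unquoted.
files="a b"
unquoted=($files)      # split into two words
quoted=("$files")      # kept as one word

echo "$literal ${#unquoted[@]} ${#quoted[@]}"
```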
Wildcards
1 | ls *.txt |
Character Set
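Note that on Linux, depending on the locale setting, the range [A-Z] in this glob may also match lowercase letters: in a dictionary-order collation (aAbB...zZ) it matches every letter except a: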
1 | ls /usr/sbin/[A-Z]* |
The fix is to use POSIX character classes (see the next section). The POSIX standard introduced a concept called a locale, which can be adjusted to select the character set needed for a particular location. We can see the language setting of our system using the following command:
1 | echo $LANG |
Note the difference from brace expansion {}: brace expansion generates strings, whereas a character set matches existing file names:
1 | # character set |
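The difference shows up as soon as one of the files is missing. This sketch works in a throwaway directory created with `mktemp -d`; the file names are made up.

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)
touch "$tmp/file1" "$tmp/file3"      # note: no file2

# Brace expansion generates all three words whether or not the files exist.
brace=$(cd "$tmp" && echo file{1,2,3})

# A character set is a pattern: it only produces names that exist.
glob=$(cd "$tmp" && echo file[123])

echo "brace: $brace"
echo "glob:  $glob"
rm -rf "$tmp"
```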
Character classes
1 | # [:upper:] is the character class |
Other useful classes:
1 | # numbers |
Shell globbing Options
Use the shopt command to set glob-related options such as nullglob and extglob.
shopt -s extglob enables the extended pattern matching operators.
shopt -s nocasematch makes Bash match case-insensitively in case or [[ ]] conditions.
I learned this from the Chinese edition of a Bash tutorial: shopt is a Bash built-in for Bash-specific settings, unlike set, which comes from POSIX.
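A sketch of what two of these options change. It toggles each option on and off so the effect is visible; the variable names and the `*.nomatch` pattern are illustrative.

```shell
#!/usr/bin/env bash
# nocasematch: affects pattern matching in [[ ]] and case.
shopt -s nocasematch
if [[ README == readme ]]; then insensitive=match; else insensitive=nomatch; fi
shopt -u nocasematch
if [[ README == readme ]]; then sensitive=match; else sensitive=nomatch; fi

# nullglob: a glob that matches nothing expands to nothing,
# instead of being left as the literal pattern.
tmp=$(mktemp -d)
shopt -u nullglob
without=("$tmp"/*.nomatch)   # one element: the unexpanded pattern
shopt -s nullglob
with=("$tmp"/*.nomatch)      # zero elements
shopt -u nullglob
rm -rf "$tmp"

echo "$insensitive $sensitive ${#without[@]} ${#with[@]}"
```

nullglob is handy in loops such as `for f in *.log`, where an unmatched pattern would otherwise run the body once with the literal string.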
Extended Globs
You need to enable it first:
1 | shopt | grep extglob |
For example, create test cases:
1 | touch file1.png photo.jpg photo photo.png file.png photo.png.jpg |
1 | # @(match): match one or others |
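Using the test files above, a few of the operators can be tried as follows. This is a sketch that works in a temporary directory; `shopt -s extglob` has to run on its own line before any line that uses the operators.

```shell
#!/usr/bin/env bash
shopt -s extglob

tmp=$(mktemp -d)
( cd "$tmp" && touch file1.png photo.jpg photo photo.png file.png photo.png.jpg )

# @(a|b): exactly one of the alternatives
exactly_one=$(cd "$tmp" && echo photo@(.jpg|.png))

# +(a|b): one or more of the alternatives
one_or_more=$(cd "$tmp" && echo photo+(.jpg|.png))

# !(pattern): anything that does not match the pattern
negated=$(cd "$tmp" && echo !(*.png))

printf '%s\n' "$exactly_one" "$one_or_more" "$negated"
rm -rf "$tmp"
```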
Mainly used on the command line, in [[ == ]] conditions, and in case conditions; it is generally faster than regular expression matching.
Brace Expansion
This is useful for things like for-loop counters and for creating files that share a prefix or suffix.
1 | # create file1.txt file2.txt file4.txt |
For easily copying and renaming a file:
1 | # match before and after , |
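Because brace expansion only generates words, it is safe to preview with `echo` before running the real command. A sketch with made-up file names (zero-padded ranges and step values assume Bash 4 or later):

```shell
#!/usr/bin/env bash
list=$(echo file{1,2,4}.txt)          # comma list
padded=$(echo {01..03})               # zero-padded range
stepped=$(echo {0..10..5})            # range with a step

# An empty alternative repeats the common part, which gives the
# "copy with a new suffix" idiom. echo shows the command that would run.
backup=$(echo cp config.yml{,.bak})

printf '%s\n' "$list" "$padded" "$stepped" "$backup"
```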
Regular Expression
Note the difference between regular expressions and globs: regular expressions match text, whereas globs are expanded by the shell. Places where regular expressions are used:
- grep
- sed
- awk
- if [[ =~ ]]
- vim for search
- less for search
- find -regex
- locate --regex
Regular Expression Info
POSIX regular expressions come in two flavors: basic regular expressions (BRE) and extended regular expressions (ERE).
ERE syntax. Note that ERE supports ( ) { } ? + | as unescaped operators, which BRE does not:
- . matches any single character
- [ ] character set
- \ escapes a single character
- | alternation: match one from among a set of expressions
- ( ) pattern grouping, for example to separate | from the rest: ^(AA|BB|CC)+
- ? * + { } repetition operators
- ^abc leading anchor
- abc$ trailing anchor
- [^abc] negates the set; ^ must appear at the beginning
Use ERE whenever possible!!! Support in the GNU tools is:
- grep -E: ERE; grep [-G]: BRE (the default)
- sed -E: ERE; sed: BRE by default
- awk: only supports ERE
- [[ =~ ]]: ERE
1 | # match one char with zero to 3 occurrences |
Backreferences
A backreference recalls the text matched by an earlier group, which is stored in a buffer. There is a limit of nine, \1 to \9:
1 | # \1 is (ss) pattern |
POSIX ERE does not support backreferences; the GNU versions support them.
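A sketch of both uses, assuming GNU grep and GNU sed: a backreference inside the pattern, which is the GNU extension noted above, and one in a sed replacement. The sample words are made up.

```shell
#!/usr/bin/env bash
# In the pattern: \1 must repeat exactly what (ss) matched,
# so this finds "ss", an "i", then "ss" again.
repeated=$(printf 'mississippi\nmissouri\n' | grep -E '(ss)i\1')

# In a sed replacement: \1 and \2 refer to the first and second group,
# here used to swap two words.
swapped=$(echo 'hello world' | sed -E 's/(hello) (world)/\2 \1/')

printf '%s\n' "$repeated" "$swapped"
```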
Bash Extended Regexp
Used with [[ =~ ]] in if conditions; it is simpler to write than extended globs but less efficient.
BASH_REMATCH: after a regular expression match, the matched text is placed into the array BASH_REMATCH:
1 | [[ abcdef =~ b.d ]] |
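${BASH_REMATCH[0]} is very useful because it holds the whole matched text. If the pattern combines several ( ) groups, the text matched by each group is stored in order in ${BASH_REMATCH[n]}, where n is 1, 2, 3, ...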
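For instance, capture groups make it easy to pull fields out of a string. The sketch below uses a hypothetical date string and variable names.

```shell
#!/usr/bin/env bash
date_str=2024-03-15                          # hypothetical input
re='^([0-9]{4})-([0-9]{2})-([0-9]{2})$'

if [[ $date_str =~ $re ]]; then
  whole=${BASH_REMATCH[0]}    # the entire match
  year=${BASH_REMATCH[1]}     # first ( ) group
  month=${BASH_REMATCH[2]}    # second ( ) group
  day=${BASH_REMATCH[3]}      # third ( ) group
  echo "$whole -> year=$year month=$month day=$day"
fi
```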
Grep EREs
Grep stands for global regular expression print. Stick to grep -E 'xxxx'.
grep -E -w matches only whole words.
grep -E -x matches only whole lines, the same as using anchors.
grep -E -o returns only the text that matches the expression.
grep -E -q is quiet mode, used to verify the existence of a search item; this was the usual approach before [[ =~ ]] was available.
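The four options can be compared on the same small input. This is a sketch with made-up text; the result variable names are illustrative.

```shell
#!/usr/bin/env bash
text=$'foo bar\nfoobar\nfoo'

# -w: "foo" must stand as a whole word, so "foobar" is skipped.
word=$(printf '%s\n' "$text" | grep -E -w 'foo')

# -x: the whole line must match, like ^foo$.
line=$(printf '%s\n' "$text" | grep -E -x 'foo')

# -o: print only the matched text, one match per line.
only=$(printf '%s\n' "$text" | grep -E -o 'ba.')

# -q: print nothing; only the exit status says whether a match exists.
if printf '%s\n' "$text" | grep -E -q 'foobar'; then
  found=yes
else
  found=no
fi

printf '%s\n' "$word" "--" "$line" "--" "$only" "--" "$found"
```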
Sed EREs
See my Sed blog.
Awk EREs
Awk supports only ERE by default; see my dedicated awk blog post.