std.string
String handling functions.
To copy or not to copy?
When a function takes a string as a parameter, and returns a string,
is that string the same as the input string, modified in place, or
is it a modified copy of the input string? The D array convention is
"copy-on-write". This means that if no modifications are done, the
original string (or slices of it) can be returned. If any modifications
are done, the returned string is a copy.
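A practical consequence of copy-on-write is that a caller should not write into a returned string in place, because it may still alias the input. The sketch below assumes a D1 compiler with the Phobos version documented here; it copies with .dup before modifying.

```d
import std.string;

void main()
{
    char[] s = "HELLO".dup;

    // Nothing needs changing, so toupper() is allowed to return s itself
    // (or a slice of it) rather than a fresh copy.
    char[] t = toupper(s);

    // Take an explicit copy before writing in place, so that s cannot
    // be affected whichever array toupper() returned.
    char[] u = t.dup;
    u[0] = 'J';

    assert(s == "HELLO");
    assert(u == "JELLO");
}
```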
- class StringException: object.Exception;
- Thrown on errors in string functions.
- const char[16] hexdigits;
- 0..9A..F
- const char[10] digits;
- 0..9
- const char[8] octdigits;
- 0..7
- const char[26] lowercase;
- a..z
- const char[26] uppercase;
- A..Z
- const char[52] letters;
- A..Za..z
- const char[6] whitespace;
- ASCII whitespace
- const dchar LS;
- UTF line separator
- const dchar PS;
- UTF paragraph separator
- const char[2] newline;
- Newline sequence for this system
- int iswhite(dchar c);
- Returns !=0 if c is whitespace
- long atoi(char[] s);
- Convert string to integer.
- real atof(char[] s);
- Convert string to real.
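A minimal usage sketch for the two conversions, assuming D1 and plain decimal input (the values are illustrative):

```d
import std.string;

void main()
{
    long n = atoi("123");   // integer conversion
    real x = atof("1.5");   // floating-point conversion

    assert(n == 123);
    assert(x == 1.5);       // 1.5 is exactly representable in binary
}
```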
- int cmp(char[] s1, char[] s2);
int icmp(char[] s1, char[] s2);
- Compare two strings. cmp is case sensitive, icmp is case insensitive.
Returns:
< 0 | s1 < s2
= 0 | s1 == s2
> 0 | s1 > s2
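As an illustration of the return convention, only the sign of the result should be tested, not its magnitude. A short sketch, assuming D1:

```d
import std.string;

void main()
{
    assert(cmp("abc", "abd") < 0);    // s1 sorts before s2
    assert(cmp("abc", "abc") == 0);   // equal
    assert(cmp("b", "a") > 0);        // s1 sorts after s2

    // cmp distinguishes case; icmp does not.
    assert(cmp("ABC", "abc") != 0);
    assert(icmp("ABC", "abc") == 0);
}
```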
- char* toStringz(char[] s);
- Convert array of chars s[] to a C-style 0 terminated string.
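D slices carry a length and are not guaranteed to be 0 terminated, which is why a conversion is needed before calling C functions. The sketch below assumes D1 and uses strlen from std.c.string only to show where the terminator ends up.

```d
import std.string;
import std.c.string;   // strlen

void main()
{
    char[] s = "hello world".dup;
    char[] word = s[0 .. 5];      // slice "hello"; no terminator after it

    char* z = toStringz(word);    // 0 terminated, safe to hand to C
    assert(strlen(z) == 5);
}
```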
- int find(char[] s, dchar c);
int ifind(char[] s, dchar c);
int rfind(char[] s, dchar c);
int irfind(char[] s, dchar c);
- find, ifind find first occurrence of c in string s.
rfind, irfind find last occurrence of c in string s.
find, rfind are case sensitive; ifind, irfind are case insensitive.
Returns:
Index in s where c is found, -1 if not found.
- int find(char[] s, char[] sub);
int ifind(char[] s, char[] sub);
int rfind(char[] s, char[] sub);
int irfind(char[] s, char[] sub);
- find, ifind find first occurrence of sub[] in string s[].
rfind, irfind find last occurrence of sub[] in string s[].
find, rfind are case sensitive; ifind, irfind are case insensitive.
Returns:
Index in s where sub is found, -1 if not found.
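The search functions can be exercised as follows. This is a sketch assuming D1; indices are 0-based offsets into s, and -1 signals that nothing was found.

```d
import std.string;

void main()
{
    // Character searches
    assert(find("hello", 'l') == 2);     // first 'l'
    assert(rfind("hello", 'l') == 3);    // last 'l'
    assert(find("hello", 'z') == -1);    // not present
    assert(find("Hello", 'h') == -1);    // case sensitive
    assert(ifind("Hello", 'h') == 0);    // case insensitive

    // Substring searches
    assert(find("hello world", "world") == 6);
    assert(rfind("abcabc", "bc") == 4);
    assert(ifind("Hello World", "WORLD") == 6);
}
```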
- char[] tolower(char[] s);
- Convert string s[] to lower case.
- char[] toupper(char[] s);
- Convert string s[] to upper case.
- char[] capitalize(char[] s);
- Capitalize first character of string s[], convert rest of string s[]
to lower case.
- char[] capwords(char[] s);
- Capitalize all words in string s[].
Remove leading and trailing whitespace.
Replace all sequences of whitespace with a single space.
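The case conversion functions might be used like this (a sketch assuming D1 and ASCII input; the sample strings are illustrative):

```d
import std.string;

void main()
{
    assert(tolower("Hello") == "hello");
    assert(toupper("Hello") == "HELLO");

    // First character upper case, the rest lower case.
    assert(capitalize("hELLO") == "Hello");

    // Each word capitalized, outer whitespace removed,
    // inner whitespace runs collapsed to one space.
    assert(capwords("  hello   world ") == "Hello World");
}
```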
- char[] repeat(char[] s, uint n);
- Return a string that consists of s[] repeated n times.
- char[] join(char[][] words, char[] sep);
- Concatenate all the strings in words[] together into one
string; use sep[] as the separator.
- char[][] split(char[] s);
- Split s[] into an array of words,
using whitespace as the delimiter.
- char[][] split(char[] s, char[] delim);
- Split s[] into an array of words,
using delim[] as the delimiter.
- char[][] splitlines(char[] s);
- Split s[] into an array of lines,
using CR, LF, or CR-LF as the delimiter.
The delimiter is not included in the line.
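A short sketch of splitting and joining, assuming D1. The word arrays are produced by split() itself, so that their type is char[][].

```d
import std.string;

void main()
{
    // Whitespace split: runs of whitespace count as one delimiter.
    char[][] words = split("alpha beta  gamma");
    assert(words.length == 3);
    assert(words[0] == "alpha" && words[2] == "gamma");

    // join() is the inverse operation with an explicit separator.
    assert(join(words, ", ") == "alpha, beta, gamma");

    // Split on an explicit delimiter.
    char[][] fields = split("a,b,c", ",");
    assert(fields.length == 3 && fields[1] == "b");

    // CR-LF and LF both end a line and are not part of the result.
    char[][] lines = splitlines("one\r\ntwo\nthree");
    assert(lines.length == 3);
    assert(lines[0] == "one" && lines[1] == "two" && lines[2] == "three");
}
```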
- char[] stripl(char[] s);
char[] stripr(char[] s);
char[] strip(char[] s);
- Strips leading or trailing whitespace, or both.
- char[] chomp(char[] s, char[] delimiter = null);
- Returns s[] sans trailing delimiter[], if any.
If delimiter[] is null, removes trailing CR, LF, or CRLF, if any.
- char[] chop(char[] s);
- Returns s[] sans trailing character, if there is one.
If last two characters are CR-LF, then both are removed.
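The trimming functions differ mainly in what they remove from the end of the string. A sketch assuming D1:

```d
import std.string;

void main()
{
    assert(stripl("  hi  ") == "hi  ");   // leading whitespace only
    assert(stripr("  hi  ") == "  hi");   // trailing whitespace only
    assert(strip("  hi  ") == "hi");      // both

    // chomp removes a specific trailing delimiter, or a line ending
    // when no delimiter is given.
    assert(chomp("line\r\n") == "line");
    assert(chomp("file.txt", ".txt") == "file");
    assert(chomp("file.txt", ".d") == "file.txt");   // no match, unchanged

    // chop removes whatever the last character is; CR-LF counts as one.
    assert(chop("abc") == "ab");
    assert(chop("abc\r\n") == "abc");
}
```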
- char[] ljustify(char[] s, int width);
char[] rjustify(char[] s, int width);
char[] center(char[] s, int width);
- Left justify, right justify, or center string s[]
in field width chars wide.
- char[] zfill(char[] s, int width);
- Same as rjustify(), but fill with '0's.
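A sketch of the padding functions, assuming D1. The center() example uses an even amount of padding so the result does not depend on which side receives an odd extra space.

```d
import std.string;

void main()
{
    assert(ljustify("ab", 5) == "ab   ");
    assert(rjustify("ab", 5) == "   ab");
    assert(center("ab", 6) == "  ab  ");
    assert(zfill("42", 5) == "00042");
}
```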
- char[] replace(char[] s, char[] from, char[] to);
- Replace occurrences of from[] with to[] in s[].
- char[] replaceSlice(char[] string, char[] slice, char[] replacement);
- Return a string that is string[] with slice[] replaced by replacement[].
- char[] insert(char[] s, uint index, char[] sub);
- Insert sub[] into s[] at location index.
- uint count(char[] s, char[] sub);
- Count up all instances of sub[] in s[].
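The editing functions above might be combined as in this sketch (assuming D1). Note that the slice argument of replaceSlice() is taken from the string being edited rather than being an independent string with the same contents.

```d
import std.string;

void main()
{
    assert(replace("a-b-c", "-", "+") == "a+b+c");
    assert(insert("hello", 2, "XX") == "heXXllo");
    assert(count("banana", "an") == 2);

    // replaceSlice: the second argument is a slice of the first.
    char[] s = "hello world".dup;
    char[] r = replaceSlice(s, s[0 .. 5], "goodbye");
    assert(r == "goodbye world");
}
```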
- char[] expandtabs(char[] string, int tabsize = 8);
- Replace tabs with the appropriate number of spaces.
tabsize is the distance between tab stops.
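Because a tab advances to the next tab stop, the number of spaces it expands to depends on the current column. A sketch assuming D1, with a tab stop every 4 columns:

```d
import std.string;

void main()
{
    // 'a' occupies column 0, so the tab fills columns 1..3.
    assert(expandtabs("a\tb", 4) == "a   b");

    // "abc" occupies columns 0..2, so the tab fills only column 3.
    assert(expandtabs("abc\td", 4) == "abc d");
}
```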
- char[] entab(char[] string, int tabsize = 8);
- Replace spaces in string with the optimal number of tabs.
Trailing spaces or tabs in a line are removed.
Params:
char[] string | String to convert.
int tabsize | Tab columns are tabsize spaces apart. tabsize defaults to 8.
- char[] maketrans(char[] from, char[] to);
- Construct translation table for translate().
BUG:
only works with ASCII
- char[] translate(char[] s, char[] transtab, char[] delchars);
- Translate characters in s[] using table created by maketrans().
Delete chars in delchars[].
BUG:
only works with ASCII
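The table built by maketrans() is passed to translate() unchanged. A sketch assuming D1 and ASCII-only input, in line with the limitation noted above:

```d
import std.string;

void main()
{
    // Map a->x, b->y, c->z; every other character maps to itself.
    char[] table = maketrans("abc", "xyz");

    assert(translate("aabbcc", table, null) == "xxyyzz");

    // Characters listed in delchars are removed from the result.
    assert(translate("a-b-c", table, "-") == "xyz");
}
```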
- char[] toString(bool b);
char[] toString(char c);
char[] toString(ubyte ub);
char[] toString(ushort us);
char[] toString(uint u);
char[] toString(ulong u);
char[] toString(byte b);
char[] toString(short s);
char[] toString(int i);
char[] toString(long i);
char[] toString(float f);
char[] toString(double d);
char[] toString(real r);
char[] toString(ifloat f);
char[] toString(idouble d);
char[] toString(ireal r);
char[] toString(cfloat f);
char[] toString(cdouble d);
char[] toString(creal r);
- Convert to char[].
- char[] toString(long value, uint radix);
char[] toString(ulong value, uint radix);
- Convert value to string in radix radix.
radix must be a value from 2 to 36.
value is treated as a signed value only if radix is 10.
The characters A through Z are used to represent values 10 through 35.
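A sketch of the radix conversion, assuming D1. The radix is written with a u suffix so the call matches the (long, uint) overload exactly; whether an unsuffixed literal would resolve unambiguously is not something this page states.

```d
import std.string;

void main()
{
    assert(toString(15L, 2u) == "1111");
    assert(toString(255L, 16u) == "FF");

    // Signed treatment applies for radix 10.
    assert(toString(-10L, 10u) == "-10");

    // The single-argument overloads use decimal.
    assert(toString(123) == "123");
}
```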
- char[] toString(char* s);
- Convert C-style 0 terminated string s to char[] string.
- char[] format(...);
- Format arguments into a string.
- char[] sformat(char[] s,...);
- Format arguments into string s which must be large
enough to hold the result. Throws ArrayBoundsError if it is not.
Returns:
s
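A sketch contrasting the two functions, assuming D1 and printf-style format specifiers as used by std.format. The buffer size of 32 is an arbitrary choice that is comfortably large enough here.

```d
import std.string;

void main()
{
    char[] name = "cart".dup;

    // format() allocates the result.
    char[] msg = format("%s has %d items", name, 3);
    assert(msg == "cart has 3 items");

    // sformat() writes into a caller-supplied buffer instead.
    char[32] buf;
    char[] r = sformat(buf, "%d-%d", 1, 2);
    assert(r[0 .. 3] == "1-2");
}
```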
- int inPattern(dchar c, char[] pattern);
- See if character c is in the pattern.
Patterns:
A pattern is an array of characters much like a character
class in regular expressions. A sequence of characters
can be given, such as "abcde". The '-' can represent a range
of characters, as "a-e" represents the same pattern as "abcde".
"a-fA-F0-9" represents all the hex characters.
If the first character of a pattern is '^', then the pattern
is negated, i.e. "^0-9" means any character except a digit.
The functions inPattern, countchars, removechars,
and squeeze use patterns.
Note:
In the future, the pattern syntax may be improved
to be more like regular expression character classes.
- int inPattern(dchar c, char[][] patterns);
- See if character c is in the intersection of the patterns.
- uint countchars(char[] s, char[] pattern);
- Count characters in s that match pattern.
- char[] removechars(char[] s, char[] pattern);
- Return string that is s with all characters removed that match pattern.
- char[] squeeze(char[] s, char[] pattern = null);
- Return string where sequences of a character in s[] from pattern[]
are replaced with a single instance of that character.
If pattern is null, it defaults to all characters.
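The pattern functions can be tried out as follows (a sketch assuming D1; the sample strings are illustrative):

```d
import std.string;

void main()
{
    // Ranges and negation
    assert(inPattern('c', "a-e") != 0);
    assert(inPattern('x', "a-e") == 0);
    assert(inPattern('5', "^0-9") == 0);   // negated: digits excluded
    assert(inPattern('x', "^0-9") != 0);

    // 'o' appears twice and 'r' once in "hello world".
    assert(countchars("hello world", "or") == 3);
    assert(removechars("hello world", "or") == "hell wld");

    // With no pattern, every repeated run is squeezed.
    assert(squeeze("hello") == "helo");

    // With a pattern, only runs of matching characters are squeezed.
    assert(squeeze("aabbcc", "b") == "aabcc");
}
```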
- char[] succ(char[] s);
- Return string that is the 'successor' to s[].
If the rightmost character is a-zA-Z0-9, it is incremented within
its case or digits. If it generates a carry, the process is
repeated with the one to its immediate left.
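The carry rule is easiest to see on short inputs. A sketch assuming D1; the examples avoid the case where the carry runs past the leftmost character, which the description above does not cover.

```d
import std.string;

void main()
{
    assert(succ("abc") == "abd");   // plain increment, no carry
    assert(succ("a9") == "b0");     // 9 wraps to 0, carry into 'a'
    assert(succ("Az") == "Ba");     // z wraps to a, carry keeps upper case
}
```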
- char[] tr(char[] str, char[] from, char[] to, char[] modifiers = null);
- Replaces characters in str[] that are in from[]
with corresponding characters in to[] and returns the resulting
string.
Params:
char[] modifiers | a string of modifier characters
Modifiers:
Modifier | Description
c | Complement the list of characters in from[]
d | Removes matching characters with no corresponding replacement in to[]
s | Removes adjacent duplicates in the replaced characters
If modifier d is present, then the number of characters
in to[] may be only 0 or 1.
If modifier d is not present and to[] is null,
then to[] is taken to be the same as from[].
If modifier d is not present and to[] is shorter
than from[], then to[] is extended by replicating the
last character in to[].
Both from[] and to[] may contain ranges using the -
character; for example, a-d is synonymous with abcd.
Neither accepts a leading ^ as meaning the complement of
the string (use the c modifier for that).
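A sketch of tr() with and without modifiers, assuming D1 (the sample strings are illustrative):

```d
import std.string;

void main()
{
    // One-to-one replacement
    assert(tr("abcdef", "cd", "CD") == "abCDef");

    // Ranges on both sides
    assert(tr("abcdef", "b-d", "B-D") == "aBCDef");

    // d modifier with an empty to[]: matching characters are deleted
    assert(tr("abcdef", "ef", "", "d") == "abcd");

    // to[] shorter than from[]: its last character is replicated
    assert(tr("abcdef", "a-c", "x") == "xxxdef");
}
```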
- final bool isNumeric(char[] s, bool bAllowSep = false);
- [in] char[] s can be formatted in the following ways:
Integer Whole Number:
(for byte, ubyte, short, ushort, int, uint, long, and ulong)
['+'|'-']digit(s)[U|L|UL]
Examples:
123, 123UL, 123L, +123U, -123L
Floating-Point Number:
(for float, double, real, ifloat, idouble, and ireal)
['+'|'-']digit(s)[.][digit(s)][[e-|e+]digit(s)][i|f|L|Li|fi]]
or [nan|nani|inf|-inf]
Examples:
+123., -123.01, 123.3e-10f, 123.3e-10fi, 123.3e-10L
(for cfloat, cdouble, and creal)
['+'|'-']digit(s)[.][digit(s)][[e-|e+]digit(s)][+]
[digit(s)[.][digit(s)][[e-|e+]digit(s)][i|f|L|Li|fi]]
or [nan|nani|nan+nani|inf|-inf]
Examples:
nan, -123e-1+456.9e-10Li, +123e+10+456i, 123+456
[in] bool bAllowSep
False by default. When set to true, the separator characters
"," and "_" are accepted within the string. These characters
must be stripped from the string before calling any of the
conversion functions such as toInt() or toFloat(); otherwise
an error will occur.
Note that no spaces are allowed anywhere in the string, whether
leading, trailing, or embedded. They too must be stripped before
calling this function or any of the conversion functions.
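A sketch of typical checks, assuming D1. The inputs are held in char[] variables so that the char[] overload is clearly the one intended; the sample values are illustrative.

```d
import std.string;

void main()
{
    char[] whole  = "123".dup;
    char[] signed = "-123.01".dup;
    char[] spaced = "12 3".dup;
    char[] sep    = "1,234".dup;

    assert(isNumeric(whole));
    assert(isNumeric(signed));

    // Embedded spaces are never accepted.
    assert(!isNumeric(spaced));

    // Separators are accepted only when bAllowSep is true.
    assert(!isNumeric(sep));
    assert(isNumeric(sep, true));
}
```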
- bool isNumeric(...);
- Allow any object as a parameter
- bool isNumeric(TypeInfo[] _arguments, void* _argptr);
- Check only the first parameter, all others will be ignored.
- char[] soundex(char[] string, char[] buffer = null);
- Soundex algorithm.
The Soundex algorithm converts a word into 4 characters
based on how the word sounds phonetically. The idea is that
two spellings that sound alike will have the same Soundex
value, which means that Soundex can be used for fuzzy matching
of names.
Params:
char[] string | String to convert to Soundex representation.
char[] buffer | Optional 4 char array to put the resulting Soundex
characters into. If null, the return value
buffer will be allocated on the heap.
Returns:
The four character array with the Soundex result in it.
Returns null if there is no Soundex representation for the string.
See Also:
Wikipedia,
The Soundex Indexing System
BUGS:
Only works well with English names.
There are other arguably better Soundex algorithms,
but this one is the standard one.
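As an illustration of fuzzy matching, two names that sound alike should yield the same code. A sketch assuming D1 and the standard Soundex letter groupings:

```d
import std.string;

void main()
{
    assert(soundex("Euler") == "E460");

    // Different spellings, same code.
    assert(soundex("Gauss") == "G200");
    assert(soundex("Ghosh") == "G200");

    // Input with no letters has no Soundex representation.
    assert(soundex("123") is null);
}
```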
- char[][char[]] abbrev(char[][] values);
- Construct an associative array consisting of all
abbreviations that uniquely map to the strings in values.
This is useful in cases where the user is expected to type
in one of a known set of strings, and the program will helpfully
autocomplete the string once sufficient characters have been
entered that uniquely identify it.
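A sketch of the resulting table for two commands that share a prefix, assuming D1. The command names are made up for the example.

```d
import std.string;

void main()
{
    char[][] values;
    values ~= "hello";
    values ~= "help";

    char[][char[]] table = abbrev(values);

    // "hell" can only mean "hello"; full names map to themselves.
    assert(table["hell"] == "hello");
    assert(table["hello"] == "hello");
    assert(table["help"] == "help");

    // "hel" is ambiguous between the two, so it is not a key.
    assert(!("hel" in table));
}
```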
- int column(char[] string, int tabsize = 8);
- Compute column number after string if string starts in the
leftmost column, which is numbered starting from 0.
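A sketch of column(), assuming D1 and the default tab size of 8:

```d
import std.string;

void main()
{
    assert(column("abc") == 3);          // three plain characters
    assert(column("\t") == 8);           // tab advances to the next stop
    assert(column("abc\t") == 8);        // same stop, reached from column 3
    assert(column("12345678\t") == 16);  // already at a stop, go to the next
}
```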
- char[] wrap(char[] s, int columns = 80, char[] firstindent = null, char[] indent = null, int tabsize = 8);
- Wrap text into a paragraph.