Safe Haskell | Safe-Inferred |
---|---|
Language | Haskell2010 |
Documentation
module Data.String
String is an alias for a list of characters.
String constants in Haskell are values of type String.
That means if you write a string literal like "hello world",
it will have the type [Char], which is the same as String.
Note: You can ask the compiler to automatically infer different types
with the -XOverloadedStrings language extension, for example
"hello world" :: Text. See IsString for more information.
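As a minimal sketch of how the extension behaves, the example below gives a string literal a type other than String. The Name type and its instance are hypothetical and exist only for illustration; the only library pieces relied on are the IsString class and its fromString method from Data.String.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.String (IsString (..))

-- Hypothetical wrapper type, used only for illustration.
newtype Name = Name String
  deriving (Eq, Show)

-- With this instance, a string literal may be used wherever a Name is expected.
instance IsString Name where
  fromString = Name

-- The type annotation decides how the literal is interpreted.
plain :: String
plain = "hello world"

name :: Name
name = "hello world" -- treated as: fromString "hello world"

main :: IO ()
main = do
  print plain -- "hello world"
  print name  -- Name "hello world"
```

Without the extension, the definition of name would be rejected, because the literal would have the fixed type String.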
Because String is just a list of characters, you can use normal list functions
to do basic string manipulation. See Data.List for operations on lists.
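For instance, the following sketch applies a few ordinary list functions from the Prelude, Data.List and Data.Char directly to a String. The helper name shout is made up for this example.

```haskell
module Main where

import Data.Char (toUpper)
import Data.List (intercalate, isPrefixOf)

-- Hypothetical helper: upper-case every character with plain `map`.
shout :: String -> String
shout = map toUpper

main :: IO ()
main = do
  let s = "hello world"
  print (length s)                      -- 11
  putStrLn (shout s)                    -- HELLO WORLD
  putStrLn (reverse s)                  -- dlrow olleh
  print ("hello" `isPrefixOf` s)        -- True
  putStrLn (intercalate ", " (words s)) -- hello, world
```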
Performance considerations
[Char] is a relatively memory-inefficient type.
It is a linked list of boxed word-size characters; internally it looks something like:
╭─────┬───┬──╮  ╭─────┬───┬──╮  ╭─────┬───┬──╮  ╭────╮
│ (:) │   │ ─┼─>│ (:) │   │ ─┼─>│ (:) │   │ ─┼─>│ [] │
╰─────┴─┼─┴──╯  ╰─────┴─┼─┴──╯  ╰─────┴─┼─┴──╯  ╰────╯
        v               v               v
       'a'             'b'             'c'
The String "abc" will use 5*3+1 = 16 (in general 5n+1) words of space in memory.
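The arithmetic can be spelled out as a small calculation. The per-element breakdown used here (three words per cons cell, two per boxed character, one for the empty list) is an assumption about how the 5n+1 figure is reached, not something stated above; real programs may use less when characters or list tails are shared.

```haskell
module Main where

-- Assumed costs, in machine words (illustrative only).
consCellWords :: Int
consCellWords = 3 -- header + pointer to the element + pointer to the tail

boxedCharWords :: Int
boxedCharWords = 2 -- header + the character itself

nilWords :: Int
nilWords = 1 -- the terminating []

-- Worst-case size of an unshared String of length n: 5n + 1 words.
stringWords :: Int -> Int
stringWords n = (consCellWords + boxedCharWords) * n + nilWords

main :: IO ()
main = do
  print (stringWords (length "abc")) -- 16
  print (stringWords 1000)           -- 5001
```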
Furthermore, operations like (++) (string concatenation) are O(n)
(in the left argument).
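To see where the linear cost comes from, here is a hypothetical re-implementation named append that has the same shape as list concatenation: it walks the whole left argument and reuses the right one unchanged. The two folds are illustrative helpers. Both produce the same result, but the left-nested one re-traverses an ever longer prefix and can take quadratic time overall.

```haskell
module Main where

-- Hypothetical stand-in with the same structure as (++):
-- one step per element of the left list, none for the right list.
append :: [a] -> [a] -> [a]
append []       ys = ys
append (x : xs) ys = x : append xs ys

-- ((("" ++ a) ++ b) ++ c): each step copies everything built so far.
leftNested :: [String] -> String
leftNested = foldl (++) ""

-- (a ++ (b ++ (c ++ ""))): each chunk is traversed only once.
rightNested :: [String] -> String
rightNested = foldr (++) ""

main :: IO ()
main = do
  let chunks = replicate 3 "ab"
  putStrLn (append "abc" "def")                   -- abcdef
  print (leftNested chunks == rightNested chunks) -- True
```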
For historical reasons, the base library uses String in a lot of places
for its conceptual simplicity, but library code dealing with user data
should use the text package for Unicode text, or the bytestring package
for binary data.
lines :: String -> [String]
Splits the argument into a list of lines stripped of their terminating
\n characters. The \n terminator is optional in a final non-empty
line of the argument string.
For example:
>>> lines ""           -- empty input contains no lines
[]
>>> lines "\n"         -- single empty line
[""]
>>> lines "one"        -- single unterminated line
["one"]
>>> lines "one\n"      -- single non-empty line
["one"]
>>> lines "one\n\n"    -- second line is empty
["one",""]
>>> lines "one\ntwo"   -- second line is unterminated
["one","two"]
>>> lines "one\ntwo\n" -- two non-empty lines
["one","two"]
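The behaviour shown in these examples can be reproduced with a short recursive function built on break. The name linesSketch is hypothetical, and this is a simplified sketch rather than the definition used in base (which may differ, for example in laziness). It is only meant to show how the optional final terminator is handled.

```haskell
module Main where

-- Simplified sketch of line splitting; not the definition used in base.
linesSketch :: String -> [String]
linesSketch "" = []
linesSketch s = case break (== '\n') s of
  (l, [])       -> [l]                  -- final line without a terminator
  (l, _ : rest) -> l : linesSketch rest -- drop the '\n' and continue

main :: IO ()
main = do
  let inputs = ["", "\n", "one", "one\n", "one\n\n", "one\ntwo", "one\ntwo\n"]
  mapM_ (print . linesSketch) inputs
  -- Compare against the real function on the same inputs.
  print (all (\s -> linesSketch s == lines s) inputs) -- True
```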
When the argument string is empty, or ends in a \n character, it can be
recovered by passing the result of lines to the unlines function.
Otherwise, unlines appends the missing terminating \n. This makes
unlines . lines idempotent:
(unlines . lines) . (unlines . lines) = (unlines . lines)
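The property can be checked on a handful of sample inputs. The name normalize is introduced here only as shorthand for unlines . lines. The last line of output lists the inputs that are not recovered exactly, which are precisely those lacking a final \n.

```haskell
module Main where

-- Hypothetical shorthand for the composition discussed above.
normalize :: String -> String
normalize = unlines . lines

main :: IO ()
main = do
  let inputs = ["", "\n", "one", "one\n", "one\n\n", "one\ntwo"]
  -- Applying normalize twice gives the same result as applying it once.
  print (all (\s -> normalize (normalize s) == normalize s) inputs) -- True
  -- Inputs changed by normalize: the ones without a terminating '\n'.
  print [s | s <- inputs, normalize s /= s] -- ["one","one\ntwo"]
```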