v.scanner #
fn new_scanner #
fn new_scanner(text string, comments_mode CommentsMode, pref_ &pref.Preferences) &Scanner
new_scanner creates a new scanner that reads its input from the given string.
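A minimal sketch of driving a scanner created with new_scanner. The helper name scan_kinds is hypothetical, and passing a default-initialised &pref.Preferences{} is an assumption made to keep the example short; real callers usually pass the preferences they already have.

```v
import v.pref
import v.scanner
import v.token

// scan_kinds is a hypothetical helper: it collects the kind of every
// token in `text`, stopping at the first .eof token.
fn scan_kinds(text string) []token.Kind {
    mut s := scanner.new_scanner(text, .skip_comments, &pref.Preferences{})
    mut kinds := []token.Kind{}
    for {
        tok := s.scan()
        if tok.kind == .eof {
            break
        }
        kinds << tok.kind
    }
    return kinds
}

fn main() {
    println(scan_kinds('x := 1 + 2'))
}
```

The loop ends on .eof because scan keeps returning a token on every call, so the caller decides when to stop.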
fn new_scanner_file #
fn new_scanner_file(file_path string, comments_mode CommentsMode, pref_ &pref.Preferences) !&Scanner
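new_scanner_file creates a new scanner that reads its input from the file at file_path. Because it returns a result (!&Scanner), callers must handle the failure case.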
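The sketch below shows one way to handle the result returned by new_scanner_file. The helper count_file_tokens and the temporary file name scanner_demo.v are hypothetical, and the default-initialised preferences are an assumption.

```v
import os
import v.pref
import v.scanner

// count_file_tokens is a hypothetical helper: it scans the file at `path`
// and returns the number of tokens before .eof. Any error from
// new_scanner_file is propagated to the caller with `!`.
fn count_file_tokens(path string) !int {
    mut s := scanner.new_scanner_file(path, .skip_comments, &pref.Preferences{})!
    mut n := 0
    for {
        if s.scan().kind == .eof {
            break
        }
        n++
    }
    return n
}

fn main() {
    // Write a small demo file to the temp directory, then scan it.
    path := os.join_path(os.temp_dir(), 'scanner_demo.v')
    os.write_file(path, 'fn main() {}\n') or { panic(err) }
    n := count_file_tokens(path) or {
        eprintln('cannot scan ${path}: ${err}')
        return
    }
    println('${path}: ${n} tokens')
}
```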
fn new_silent_scanner #
fn new_silent_scanner() &Scanner
new_silent_scanner returns a new scanner instance, set up to only set internal flags and append errors to its .errors field, without aborting the program. It is mainly useful for programs that want to lex potentially invalid V source code repeatedly and do their own error handling (by checking .errors.len).
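As a rough illustration of that workflow, the sketch below lexes a snippet with a silent scanner and reports how many errors were collected. The helper count_scan_errors is hypothetical; feeding the text through prepare_for_new_text and reading tokens with text_scan follows the descriptions of those methods further down this page.

```v
import v.scanner

// count_scan_errors is a hypothetical helper: it lexes `src` with a
// silent scanner and returns how many errors were appended to .errors.
fn count_scan_errors(src string) int {
    mut s := scanner.new_silent_scanner()
    s.prepare_for_new_text(src)
    for {
        if s.text_scan().kind == .eof {
            break
        }
    }
    return s.errors.len
}

fn main() {
    println(count_scan_errors('x := 1'))
    // An unterminated string literal is used here as an example of invalid input.
    println(count_scan_errors("x := 'abc"))
}
```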
enum CommentsMode #
enum CommentsMode {
    skip_comments
    parse_comments
    toplevel_comments
}
The different kinds of scanner modes:
.skip_comments - simplest/fastest, just ignores all comments early. This mode is used by the compiler itself.
.parse_comments is used by vfmt. Ideally it should handle inline /* */ comments too, i.e. it returns every kind of comment as a new token.
.toplevel_comments is used by vdoc, parses only top level ones that are outside structs/enums/fns.
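To make the difference between the modes concrete, the sketch below counts the comment tokens that scan returns for the same text under two modes. The helper count_comments is hypothetical, and it assumes comments surface as tokens of kind .comment.

```v
import v.pref
import v.scanner

// count_comments is a hypothetical helper: it returns how many .comment
// tokens scan() yields for `text` under the given comments mode.
fn count_comments(text string, mode scanner.CommentsMode) int {
    mut s := scanner.new_scanner(text, mode, &pref.Preferences{})
    mut n := 0
    for {
        tok := s.scan()
        if tok.kind == .eof {
            break
        }
        if tok.kind == .comment {
            n++
        }
    }
    return n
}

fn main() {
    src := '// a note\nx := 1 // trailing\n'
    println(count_comments(src, .skip_comments))
    println(count_comments(src, .parse_comments))
}
```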
struct Scanner #
struct Scanner {
pub mut:
    file_path string // '/path/to/file.v'
    file_base string // 'file.v'
    text string // the whole text of the file
    pos int = -1 // current position in the file, first character is s.text[0]
    line_nr int // current line number
    last_nl_pos int = -1 // for calculating column
    is_crlf bool // special check when computing columns
    is_inside_string bool // set to true in a string, *at the start* of an $var or ${expr}
    is_nested_string bool // '${'abc':-12s}'
    is_inter_start bool // for hacky string interpolation TODO simplify
    is_inter_end bool
    is_enclosed_inter bool
    is_nested_enclosed_inter bool
    line_comment string
    last_lt int = -1 // position of latest <
    is_print_line_on_error bool
    is_print_colored_error bool
    is_print_rel_paths_on_error bool
    quote u8 // which quote is used to denote current string: ' or "
    inter_quote u8
    just_closed_inter bool // if is_enclosed_inter was set to false on the previous character: `}`
    nr_lines int // total number of lines in the source file that were scanned
    is_vh bool // Keep newlines
    is_fmt bool // Used for v fmt.
    comments_mode CommentsMode
    is_inside_toplvl_statement bool // *only* used in comments_mode: .toplevel_comments, toggled by parser
    all_tokens []token.Token // *only* used in comments_mode: .toplevel_comments, contains all tokens
    tidx int
    eofs int
    max_eofs int = 50
    inter_cbr_count int
    pref &pref.Preferences
    error_details []string
    errors []errors.Error
    warnings []errors.Warning
    notices []errors.Notice
    should_abort bool // when too many errors/warnings/notices are accumulated, should_abort becomes true, and the scanner should stop
    // the following are used only inside ident_string, but are here to avoid allocating new arrays for the most common case of strings without escapes
    all_pos []int
    u16_escapes_pos []int // pos list of \uXXXX
    u32_escapes_pos []int // pos list of \UXXXXXXXX
    h_escapes_pos []int // pos list of \xXX
    str_segments []string
}
fn (Scanner) free #
fn (mut s Scanner) free()
fn (Scanner) set_is_inside_toplevel_statement #
fn (mut s Scanner) set_is_inside_toplevel_statement(newstate bool)
Note: this is called by V's parser.
fn (Scanner) set_current_tidx #
fn (mut s Scanner) set_current_tidx(cidx int)
fn (Scanner) scan #
fn (mut s Scanner) scan() token.Token
fn (Scanner) peek_token #
fn (s &Scanner) peek_token(n int) token.Token
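peek_token is not described here, so the sketch below rests on an assumption: that n is an offset from the scanner's current token index, which would make peek_token(0) a preview of the token the next scan call returns. The helper peek_matches_next is hypothetical and simply checks that assumption for a given text.

```v
import v.pref
import v.scanner

// peek_matches_next is a hypothetical helper: after consuming one token,
// it compares peek_token(0) with the token returned by the next scan().
fn peek_matches_next(text string) bool {
    mut s := scanner.new_scanner(text, .skip_comments, &pref.Preferences{})
    first := s.scan()
    // Assumption: offset 0 refers to the next token that has not been consumed yet.
    peeked := s.peek_token(0)
    following := s.scan()
    return first.kind != .eof && peeked.kind == following.kind && peeked.lit == following.lit
}

fn main() {
    println(peek_matches_next('a + b'))
}
```

Since the receiver of peek_token is not mutable, looking ahead this way should not advance the scanner.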
fn (Scanner) text_scan #
fn (mut s Scanner) text_scan() token.Token
text_scan returns a single token from the text and updates the scanner state, so that it is ready to return the next token on the following call. See also Scanner.prepare_for_new_text and new_silent_scanner().
fn (Scanner) ident_string #
fn (mut s Scanner) ident_string() string
ident_string returns a lexed V string, starting from the current position in the text. It supports r'strings', c'strings', interpolated 'strings' and "strings", and hex escapes in them (except in r'strings', where the content is returned verbatim).
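The sketch below does not call ident_string directly. It assumes that scan produces tokens of kind .string whose lit field carries the lexed string content, and prints that content for a plain and a raw literal. The helper string_lits is hypothetical.

```v
import v.pref
import v.scanner

// string_lits is a hypothetical helper: it returns the lit of every
// .string token found in `src`.
fn string_lits(src string) []string {
    mut s := scanner.new_scanner(src, .skip_comments, &pref.Preferences{})
    mut lits := []string{}
    for {
        tok := s.scan()
        if tok.kind == .eof {
            break
        }
        if tok.kind == .string {
            lits << tok.lit
        }
    }
    return lits
}

fn main() {
    println(string_lits("'hello'"))
    // A raw literal: its content is expected to come back verbatim.
    println(string_lits(r"r'a\nb'"))
}
```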
fn (Scanner) ident_char #
fn (mut s Scanner) ident_char() string
ident_char is called when a backtick "single-char" literal is parsed from the code. It is needed because some runes (chars) are written with escape sequences. The string it returns should be a standardized, simplified version of the character, as it would appear in source code. Possibilities:
single chars like `a`, `b` => 'a', 'b'
escaped single chars like `\\`, `\``, `\n` => '\\', '`', '\n'
escaped single hex bytes like `\x01`, `\x61` => '\x01', 'a'
escaped unicode literals like `\u2605`
escaped unicode 32 literals like `\U00002605`
escaped utf8 runes in hex like `\xe2\x98\x85` => (★)
escaped utf8 runes in octal like `\342\230\205` => (★)
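A small sketch for inspecting how character literals come out of the scanner. It assumes that backtick literals are returned by scan as tokens of kind .chartoken, with the simplified character in lit. The helper char_lits is hypothetical.

```v
import v.pref
import v.scanner

// char_lits is a hypothetical helper: it returns the lit of every
// .chartoken token found in `src`.
fn char_lits(src string) []string {
    mut s := scanner.new_scanner(src, .skip_comments, &pref.Preferences{})
    mut lits := []string{}
    for {
        tok := s.scan()
        if tok.kind == .eof {
            break
        }
        if tok.kind == .chartoken {
            lits << tok.lit
        }
    }
    return lits
}

fn main() {
    println(char_lits('`a`'))
    // A raw V string keeps the backslash, so the scanner sees the escape itself.
    println(char_lits(r'`\x61`'))
}
```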
fn (Scanner) current_pos #
fn (mut s Scanner) current_pos() token.Pos
fn (Scanner) note #
fn (mut s Scanner) note(msg string)
fn (Scanner) add_error_detail #
fn (mut s Scanner) add_error_detail(msg string)
Call this before calling error or warn.
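The ordering can be sketched with a silent scanner, which collects errors instead of aborting. The helper report_with_detail and both message strings are hypothetical. It is an assumption that the pending detail is stored in the error_details field until the next error call.

```v
import v.scanner

// report_with_detail is a hypothetical helper: it queues one detail,
// reports one error, and returns the number of pending details seen
// before the error call and the number of errors collected after it.
fn report_with_detail() (int, int) {
    mut s := scanner.new_silent_scanner()
    s.prepare_for_new_text('x := 1')
    // The detail is added first, so that it can accompany the error below.
    s.add_error_detail('hypothetical hint: check the literal on this line')
    pending := s.error_details.len
    s.error('hypothetical error message')
    return pending, s.errors.len
}

fn main() {
    pending, nr_errors := report_with_detail()
    println('pending details: ${pending}, errors: ${nr_errors}')
}
```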
fn (Scanner) add_error_detail_with_pos #
fn (mut s Scanner) add_error_detail_with_pos(msg string, pos token.Pos)
fn (Scanner) warn #
fn (mut s Scanner) warn(msg string)
fn (Scanner) warn_with_pos #
fn (mut s Scanner) warn_with_pos(msg string, pos token.Pos)
fn (Scanner) error #
fn (mut s Scanner) error(msg string)
fn (Scanner) error_with_pos #
fn (mut s Scanner) error_with_pos(msg string, pos token.Pos)
fn (Scanner) prepare_for_new_text #
fn (mut s Scanner) prepare_for_new_text(text string)
prepare_for_new_text resets the internal state of the scanner, so that it can be reused for scanning the new text, given by text, using a subsequent s.text_scan() call, to get the token corresponding to the text.
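A minimal sketch of reusing one scanner for several texts. The helper token_counts is hypothetical, and it assumes that each prepare_for_new_text call leaves the scanner ready to lex the new text from its first character.

```v
import v.scanner

// token_counts is a hypothetical helper: it reuses a single silent
// scanner to count the tokens (excluding .eof) in each of `texts`.
fn token_counts(texts []string) []int {
    mut s := scanner.new_silent_scanner()
    mut counts := []int{}
    for text in texts {
        // Reset the scanner state before lexing the next text.
        s.prepare_for_new_text(text)
        mut n := 0
        for {
            if s.text_scan().kind == .eof {
                break
            }
            n++
        }
        counts << n
    }
    return counts
}

fn main() {
    println(token_counts(['a + b', 'x := 42']))
}
```

Reusing the instance avoids creating a new scanner for every snippet, which is the repeated-lexing use case that new_silent_scanner is described as serving.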