
v.scanner #

fn new_scanner #

fn new_scanner(text string, comments_mode CommentsMode, pref_ &pref.Preferences) &Scanner

new_scanner creates a new scanner for the given source text.
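As a minimal sketch of how new_scanner and Scanner.scan fit together, the helper below collects the kind of every token in a string. The helper name scan_kinds and the use of an empty &pref.Preferences{} are choices made for this illustration, not requirements of the API.

```v
import v.pref
import v.scanner
import v.token

// scan_kinds (hypothetical helper) returns the kinds of all tokens in `text`,
// stopping at, and not including, the .eof token.
fn scan_kinds(text string) []token.Kind {
	// Assumption: default preferences are sufficient for plain lexing.
	mut s := scanner.new_scanner(text, .skip_comments, &pref.Preferences{})
	mut kinds := []token.Kind{}
	for {
		tok := s.scan()
		if tok.kind == .eof {
			break
		}
		kinds << tok.kind
	}
	return kinds
}

println(scan_kinds('x := 1 + 2'))
```

The loop stops on .eof because scan keeps returning tokens until the end of the text is reached.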

fn new_scanner_file #

fn new_scanner_file(file_path string, comments_mode CommentsMode, pref_ &pref.Preferences) !&Scanner

new_scanner_file creates a new scanner for the contents of the file at file_path.

fn new_silent_scanner #

fn new_silent_scanner() &Scanner

new_silent_scanner returns a new scanner instance, set up to just set internal flags and append errors to its .errors field, without aborting the program. It is mainly useful for programs that want to lex potentially invalid V source code repeatedly and do their own error handling (by checking .errors.len).
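The error-handling pattern described above might look like the following sketch. The helper name error_counts is hypothetical. To avoid relying on whether prepare_for_new_text clears .errors, the code only counts errors added after each call.

```v
import v.scanner

// error_counts (hypothetical helper) lexes the first token of each snippet
// with one shared silent scanner, and records how many errors that produced.
fn error_counts(snippets []string) []int {
	mut s := scanner.new_silent_scanner()
	mut counts := []int{}
	for text in snippets {
		s.prepare_for_new_text(text)
		// Measure relative to the count after the reset, so the result does
		// not depend on whether earlier errors were cleared.
		before := s.errors.len
		_ := s.text_scan()
		counts << s.errors.len - before
	}
	return counts
}

println(error_counts(['x := 1', '"unterminated']))
```

A non-zero entry means the scanner reported at least one problem for that snippet instead of aborting the program.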

enum CommentsMode #

enum CommentsMode {
	skip_comments
	parse_comments
	toplevel_comments
}

The different kinds of scanner modes:

.skip_comments - simplest/fastest, just ignores all comments early. This mode is used by the compiler itself.

.parse_comments is used by vfmt. Ideally it should handle inline /* */ comments too, i.e. it returns every kind of comment as a new token.

.toplevel_comments is used by vdoc, parses only top level ones that are outside structs/enums/fns.
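To illustrate the difference between the modes, the sketch below counts the .comment tokens produced for the same text under a given mode. The helper name count_comments and the empty &pref.Preferences{} are assumptions made for the example.

```v
import v.pref
import v.scanner

// count_comments (hypothetical helper) returns how many .comment tokens
// the scanner emits for `text` in the given comments mode.
fn count_comments(text string, mode scanner.CommentsMode) int {
	mut s := scanner.new_scanner(text, mode, &pref.Preferences{})
	mut n := 0
	for {
		tok := s.scan()
		if tok.kind == .eof {
			break
		}
		if tok.kind == .comment {
			n++
		}
	}
	return n
}

src := '// a line comment\nx := 1\n'
println(count_comments(src, .skip_comments))
println(count_comments(src, .parse_comments))
```

With .skip_comments the comment should never reach the caller, while .parse_comments is expected to surface it as a token of its own.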

struct Scanner #

@[minify]
struct Scanner {
pub mut:
	file_path                   string // '/path/to/file.v'
	file_base                   string // 'file.v'
	text                        string // the whole text of the file
	pos                         int = -1 // current position in the file, first character is s.text[0]
	line_nr                     int // current line number
	last_nl_pos                 int = -1 // for calculating column
	is_crlf                     bool // special check when computing columns
	is_inside_string            bool // set to true in a string, *at the start* of an $var or ${expr}
	is_nested_string            bool // '${'abc':-12s}'
	is_inter_start              bool // for hacky string interpolation TODO simplify
	is_inter_end                bool
	is_enclosed_inter           bool
	is_nested_enclosed_inter    bool
	line_comment                string
	last_lt                     int = -1 // position of latest <
	is_print_line_on_error      bool
	is_print_colored_error      bool
	is_print_rel_paths_on_error bool
	quote                       u8 // which quote is used to denote current string: ' or "
	inter_quote                 u8
	just_closed_inter           bool // if is_enclosed_inter was set to false on the previous character: `}`
	nr_lines                    int  // total number of lines in the source file that were scanned
	is_vh                       bool // Keep newlines
	is_fmt                      bool // Used for v fmt.
	comments_mode               CommentsMode
	is_inside_toplvl_statement  bool          // *only* used in comments_mode: .toplevel_comments, toggled by parser
	all_tokens                  []token.Token // *only* used in comments_mode: .toplevel_comments, contains all tokens
	tidx                        int
	eofs                        int
	max_eofs                    int = 50
	inter_cbr_count             int
	pref                        &pref.Preferences
	error_details               []string
	errors                      []errors.Error
	warnings                    []errors.Warning
	notices                     []errors.Notice
	should_abort                bool // when too many errors/warnings/notices are accumulated, should_abort becomes true, and the scanner should stop

	// the following are used only inside ident_string, but are here to avoid allocating new arrays for the most common case of strings without escapes
	all_pos         []int
	u16_escapes_pos []int // pos list of \uXXXX
	u32_escapes_pos []int // pos list of \UXXXXXXXX
	h_escapes_pos   []int // pos list of \xXX
	str_segments    []string
}

fn (Scanner) free #

@[unsafe]
fn (mut s Scanner) free()

fn (Scanner) set_is_inside_toplevel_statement #

fn (mut s Scanner) set_is_inside_toplevel_statement(newstate bool)

Note: this is called by V's parser.

fn (Scanner) set_current_tidx #

fn (mut s Scanner) set_current_tidx(cidx int)

fn (Scanner) scan #

fn (mut s Scanner) scan() token.Token

fn (Scanner) peek_token #

fn (s &Scanner) peek_token(n int) token.Token

fn (Scanner) text_scan #

fn (mut s Scanner) text_scan() token.Token

text_scan returns a single token from the text, and updates the scanner state so that it is ready to return the next token on the following call. See also Scanner.prepare_for_new_text and new_silent_scanner().

fn (Scanner) ident_string #

fn (mut s Scanner) ident_string() string

ident_string returns a lexed V string, starting from the current position in the text. It supports r'strings', c'strings', interpolated 'strings' and "strings", and hex escapes in them (except in r'strings', where the content is returned verbatim).

fn (Scanner) ident_char #

fn (mut s Scanner) ident_char() string

ident_char is called when a backtick "single-char" is parsed from the code. It is needed because some runes (chars) are written with escape sequences. The string it returns should be a standardized, simplified version of the character as it would appear in source code.

Possibilities:

single chars like a, b => 'a', 'b'

escaped single chars like \\, \`, \n => '\\', '`', '\n'

escaped single hex bytes like \x01, \x61 => '\x01', 'a'

escaped unicode literals like \u2605

escaped unicode 32 literals like \U00002605

escaped utf8 runes in hex like \xe2\x98\x85 => (★)

escaped utf8 runes in octal like \342\230\205 => (★)

fn (Scanner) current_pos #

fn (mut s Scanner) current_pos() token.Pos

fn (Scanner) note #

fn (mut s Scanner) note(msg string)

fn (Scanner) add_error_detail #

fn (mut s Scanner) add_error_detail(msg string)

Call this before calling error or warn.

fn (Scanner) add_error_detail_with_pos #

fn (mut s Scanner) add_error_detail_with_pos(msg string, pos token.Pos)

fn (Scanner) warn #

fn (mut s Scanner) warn(msg string)

fn (Scanner) warn_with_pos #

fn (mut s Scanner) warn_with_pos(msg string, pos token.Pos)

fn (Scanner) error #

fn (mut s Scanner) error(msg string)

fn (Scanner) error_with_pos #

fn (mut s Scanner) error_with_pos(msg string, pos token.Pos)

fn (Scanner) prepare_for_new_text #

fn (mut s Scanner) prepare_for_new_text(text string)

prepare_for_new_text resets the internal state of the scanner, so that it can be reused for scanning the new text given by text. A subsequent s.text_scan() call then returns the first token of that text.
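A short sketch of reusing one scanner across several texts, assuming a scanner obtained from new_silent_scanner(). The helper name first_kinds is hypothetical.

```v
import v.scanner
import v.token

// first_kinds (hypothetical helper) returns the kind of the first token
// of each snippet, reusing a single scanner instance for all of them.
fn first_kinds(snippets []string) []token.Kind {
	mut s := scanner.new_silent_scanner()
	mut kinds := []token.Kind{}
	for text in snippets {
		// Reset the scanner state before lexing the next, unrelated text.
		s.prepare_for_new_text(text)
		kinds << s.text_scan().kind
	}
	return kinds
}

println(first_kinds(['fn', '42', 'abc']))
```

Reusing the instance avoids constructing a new scanner, with its own preferences, for every snippet.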