html - Storing malicious code in a database - is escape-on-output always the correct approach?


I just want to understand the thinking here and arrive at a correct, accepted approach to this issue. The context is a web environment, and I am talking about escaping on input to the database.

I understand many of the reasons behind not escaping on input when taking user input and storing it in a database. You might want to use that input in a variety of different ways (as JSON, as SMS, etc.), and you might also want to show that input to the user in its original form.

Before putting anything into the database, we make sure there are no SQL injection attacks, to protect the database.

However, the principles set out here and here suggest an approach of saving the user input as is. This user input might not be an SQL injection attack, but it could be other malicious code. In these cases, is it OK to store JavaScript-based XSS attacks in the database?

I want to know if my assumptions here are correct: are we fine with storing malicious code in the database as long as that malicious code doesn't directly affect the database? Is it a case of this not being the database's problem, so that it can hold the malicious code and it is up to the output device to avoid the pitfalls of the malicious code?

Or should we be doing more escaping on input than is suggested by these principles? Do security concerns come before the idea of escaping on output? Should we take the approach that no malicious code enters the database? Why would we want to store malicious code anyway?

What is the correct approach to saving malicious code to a database in the context of a web client/server environment?

[For the purposes of this question I am ignoring sites that allow code to be shared on them; I am thinking of "normal" inputs such as name, comment, and description fields.]

Definition: I use the term "sanitize" instead of filter or escape, because there's a third option: rejecting invalid input. For example, returning an error to the user saying "character ‽ may not be used in a title" prevents you from ever having to store it at all.

saving the user input as is

The security principle of "defense in depth" suggests that you should sanitize any potentially malicious input as early and as often as possible. Whitelist only the values and strings that are useful to your application. Even if you do, you'll still have to encode/escape these values too.
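As a rough illustration of the "reject" option combined with a whitelist, here is a minimal Python sketch. The allowed character set, the 80-character limit, and the names `TITLE_PATTERN`, `InvalidInput`, and `validate_title` are all assumptions made up for this example, not rules from the question.

```python
import re

# Hypothetical whitelist: letters, digits, spaces, and a little punctuation,
# between 1 and 80 characters. Adjust to what your application really needs.
TITLE_PATTERN = re.compile(r"[A-Za-z0-9 .,'!?-]{1,80}")


class InvalidInput(ValueError):
    """Raised when input falls outside the whitelist."""


def validate_title(raw: str) -> str:
    # Reject rather than silently rewrite, so the user can be told what is wrong.
    if TITLE_PATTERN.fullmatch(raw) is None:
        raise InvalidInput("title contains characters that are not allowed")
    return raw
```

A value that passes this check still has to be encoded for whatever context it is later written into; the whitelist only narrows what can reach the database.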

Why would we want to store malicious code anyway?

There are times when accuracy is more important than paranoia. For example, user feedback may need to include potentially disruptive code. I can imagine writing user feedback that says, "Every time I type %00 as part of a wiki title, the application crashes." Even if wiki titles don't need %00 characters, the comment should still transmit them accurately. Failing to allow this in comments prevents the operators from learning of a serious issue. See: Null Byte Injection

up to the output device to avoid the pitfalls of the malicious code

If you need to store arbitrary data, the correct approach is to escape whenever you switch to any other encoding type. Note that you must decode (unescape) and then encode (escape); there is no such thing as non-encoded data - even binary is at least big-endian or little-endian. Most folks use the language's built-in strings as the 'most decoded' format, but even that can get wonky when considering Unicode vs. ASCII. User input in web applications will be URL-encoded, HTTP-encoded, or encoded according to a "Content-Type" header. See: http://www.ietf.org/rfc/rfc2616.txt
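To make "escape as you switch encodings" concrete, here is a small Python sketch using only the standard library. The value of `stored` is an invented example; it stands for a string that was saved as is and is then encoded separately for each output context.

```python
import html
import json
from urllib.parse import quote, unquote

# Invented example of a value stored unmodified in the database.
stored = '<script>alert("hi")</script> & more'

# Encode at the moment of output, once per target context.
as_html = html.escape(stored)    # for placement in HTML text or attributes
as_json = json.dumps(stored)     # for a JSON string literal
as_url = quote(stored, safe="")  # for a URL path or query component

# Each encoding can be reversed, so the original is never lost.
round_trip_ok = (
    html.unescape(as_html) == stored
    and json.loads(as_json) == stored
    and unquote(as_url) == stored
)
```

Because the stored value stays in its "most decoded" form, adding a new output format later only means adding a new encoder, not re-cleaning old rows.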

Most systems now do this for you as part of templating or parameterized queries. For example, a parameterized query function like query("INSERT INTO table VALUES (?)", name) would prevent the need to escape single quotes or anything else in the name. If you don't have a convenience like this, it helps to create objects that track data per encoding type, such as an htmlstring with a constructor newhtmlstring(string) and a decode() function.
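Both ideas can be sketched in Python. The first part uses the standard `sqlite3` module's `?` placeholders; the second is one possible shape for an object that tracks its encoding. The table name `comments`, the sample `name` value, and the `HtmlString` / `new_html_string` names are assumptions for illustration only.

```python
import html
import sqlite3
from dataclasses import dataclass

# Part 1: a parameterized query. The driver passes `name` as data,
# so quotes and SQL keywords inside it are never interpreted as SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (name TEXT)")

name = "Robert'); DROP TABLE comments;--"  # invented hostile-looking input
conn.execute("INSERT INTO comments (name) VALUES (?)", (name,))

stored_name = conn.execute("SELECT name FROM comments").fetchone()[0]
row_count = conn.execute("SELECT COUNT(*) FROM comments").fetchone()[0]


# Part 2: a hypothetical wrapper that records "this text is HTML-encoded".
@dataclass(frozen=True)
class HtmlString:
    encoded: str

    def decode(self) -> str:
        # Return the original, unescaped text.
        return html.unescape(self.encoded)


def new_html_string(raw: str) -> HtmlString:
    # Encode exactly once, at construction time.
    return HtmlString(html.escape(raw))


safe_name = new_html_string(stored_name)
```

Keeping the encoded form inside its own type makes it harder to escape a value twice or to print a raw string into a page by accident.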

Should we take the approach that no malicious code enters the database?

Because the database cannot determine all possible future encodings, it is impossible to sanitize against all potential injections. For example, SQL and HTML may not care about backticks, but JavaScript and bash do.
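The backtick case is easy to check with Python's standard library. In this sketch, `value` is an invented input; `html.escape` leaves it untouched because backticks mean nothing special in HTML, while `shlex.quote` has to wrap it before it is safe to hand to a POSIX shell.

```python
import html
import shlex

value = "`id`"  # invented input: harmless in HTML, command substitution in bash

for_html = html.escape(value)   # unchanged: "`id`"
for_shell = shlex.quote(value)  # wrapped in single quotes: "'`id`'"

# Sanitizing for HTML at input time would not have protected a later shell use.
html_escaping_changed_it = for_html != value
```

An input-time filter written with only HTML in mind would have stored this value as "clean", which is why the encoding has to be chosen at the point of output.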

