By Our Powers Combined: Observability for Developers

A presentation at Abstractions II in August 2019 in Pittsburgh, PA, USA by Aaron Aldrich

Slide 1

BY OUR POWERS COMBINED: OBSERVABILITY FOR DEVELOPERS @CrayZeigh — #Abstractions

Slide 2

HI, ABSTRACTIONS!

theelasticast.com | @CrayZeigh | aaron.aldrich@elastic.co | noti.st/crayzeigh

Slide 3

OBSERVABILITY

Slide 4

DEVOPS

Slide 5

DEVOPS

Slide 6

Slide 7

Slide 8

DEVOPS
> Wave 1: Ops learns code & automation [1]
> Wave 2: Dev owns code through production [1]
[1] https://vimeo.com/341142053

Slide 9

*Simon Wardley: https://twitter.com/swardley/status/1014883354481741825?lang=en

Slide 10

https://dev.to/molly_struve/making-on-call-not-suck-490

Slide 11

SHARED LANGUAGE

Slide 12

SHARED TOOLS

Slide 13

SHARED SOURCE OF TRUTH

Slide 14

OBSERVABILITY
A system is observable when you can ask arbitrary questions about it and receive meaningful answers without having to resort to writing new code or command-line tools. It lets you discover unknown-unknowns and debug in production.

Slide 15

Isn’t it just Monitoring with better SEO? -- You

Slide 16

YOU’RE NOT WRONG...

Slide 17

Slide 18

Slide 19

Software is inherently opaque; we have to instrument it to output meaningful information.

Slide 20

THE THREE PILLARS OF OBSERVABILITY
1. Logs
2. Metrics
3. APM

Slide 21

THE THREE PILLARS OF OBSERVABILITY
1. Logs Events
2. Metrics
3. APM

Slide 22

THE THREE PILLARS OF OBSERVABILITY
1. Logs Events
2. Metrics
3. APM
4. Distributed Tracing

Slide 23

THE THREE PILLARS OF OBSERVABILITY
1. Logs Events
2. Metrics
3. APM & Distributed Tracing
4. Distributed Tracing

Slide 24

Slide 25

LOGS/METRICS/APM ARE THE MEDIA WE WORK IN

Slide 26

METRICS

Slide 27

Slide 28

EVENTS

Slide 29

CARDINALITY & YOU
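
Cardinality here means the number of distinct values a field takes across events. As a rough sketch only, with made-up sample events and a hypothetical helper named `field_cardinality` (neither comes from the talk), this shows how a low-cardinality field such as a response code compares with a high-cardinality one such as a transaction ID:

```python
from collections import defaultdict

# Hypothetical sample events; the field names are illustrative only.
EVENTS = [
    {"response": 200, "os": "osx", "transaction": "t-001"},
    {"response": 200, "os": "linux", "transaction": "t-002"},
    {"response": 400, "os": "osx", "transaction": "t-003"},
    {"response": 200, "os": "osx", "transaction": "t-004"},
]


def field_cardinality(events):
    """Return {field: number of distinct values seen for that field}."""
    seen = defaultdict(set)
    for event in events:
        for field, value in event.items():
            seen[field].add(value)
    return {field: len(values) for field, values in seen.items()}


if __name__ == "__main__":
    for field, count in sorted(field_cardinality(EVENTS).items()):
        print(f"{field}: {count} distinct value(s)")
```

In this sample the transaction field has one distinct value per event, which is the kind of field that tends to be most useful for pinpointing a single request.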

Slide 30

EXISTING LOGS
66.249.65.159 - - [06/Nov/2014:19:10:38 +0600] "GET /news/53f8d72920ba2744fe873ebc.html HTTP/1.1" 404 177 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.65.3 - - [06/Nov/2014:19:11:24 +0600] "GET /?q=%E0%A6%AB%E0%A6%BE%E0%A7%9F%E0%A6%BE%E0%A6%B0 HTTP/1.1" 200 4223 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.65.62 - - [06/Nov/2014:19:12:14 +0600] "GET /?q=%E0%A6%A6%E0%A7%8B%E0%A7%9F%E0%A6%BE HTTP/1.1" 200 4356 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
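
Lines like these carry useful fields, but only once something parses them. A minimal sketch, assuming the lines follow the common "combined" access log layout; the regular expression, the field names, and the `parse_access_log_line` helper are illustrative choices, not anything prescribed by the talk:

```python
import re
from datetime import datetime

# Assumed layout: client, ident, user, [timestamp], "request", status, bytes,
# "referrer", "user agent". Field names are this sketch's own.
LOG_PATTERN = re.compile(
    r'(?P<client_ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

SAMPLE_LINE = (
    '66.249.65.3 - - [06/Nov/2014:19:11:24 +0600] '
    '"GET /?q=%E0%A6%AB%E0%A6%BE%E0%A7%9F%E0%A6%BE%E0%A6%B0 HTTP/1.1" '
    '200 4223 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)


def parse_access_log_line(line):
    """Turn one access log line into a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    event = match.groupdict()
    event["status"] = int(event["status"])
    event["bytes"] = 0 if event["bytes"] == "-" else int(event["bytes"])
    # Normalize the timestamp to ISO 8601 so it sorts and filters cleanly.
    parsed = datetime.strptime(event["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    event["timestamp"] = parsed.isoformat()
    return event


if __name__ == "__main__":
    print(parse_access_log_line(SAMPLE_LINE))
```

Once the line is a dictionary, questions such as "which paths returned 404" become a filter on a field instead of a text search.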

Slide 31

STRUCTURED LOGGING
{
  "@timestamp": "2019-08-04T12:30:04.000Z",
  …
  "container": {
    "image.id": "48f5af6667f3457be0a2c7814caefe21ed3c94fb94bd6243096b3a61ea502b1d",
    "version": "version"
  },
  "build.id": "efdd0b5e69b0742fa5e5bad0771df4d1df2459d1",
  …
  "transaction": "transaction_ID",
  "user": "importantPerson",
  "account": "0129",
  "os": "osx",
  …
  "api_endpoint": "endpoint",
  …
  "response": 400,
  …
  "message": "Some informative thing, probably more human readable and friendly, but difficult to parse"
}
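
One way an application could emit events shaped like this is a JSON formatter on Python's standard `logging` module. This is a sketch under assumptions: the `JsonFormatter` class, the `build_logger` helper, and the convention of passing fields through `extra={"fields": {...}}` are illustrative, and the sample field values are placeholders.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        event = {
            "@timestamp": created.isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Assumed convention: structured fields arrive via extra={"fields": {...}}.
        event.update(getattr(record, "fields", {}))
        return json.dumps(event)


def build_logger(name="app", stream=None):
    """Return a logger that writes JSON lines to `stream` (stderr if None)."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger()
    log.info(
        "Request rejected",
        extra={"fields": {
            "transaction": "transaction_ID",
            "user": "importantPerson",
            "api_endpoint": "endpoint",
            "response": 400,
        }},
    )
```

The human-readable message stays, while the fields around it remain individually queryable.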

Slide 32

APM & TRACING
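
To make the idea of a trace concrete, here is a toy sketch of spans that share a trace ID and point at their parent. It is not the API of any APM agent; `span`, `FINISHED_SPANS`, `handle_request`, and the span names are all hypothetical, and a real agent would handle ID propagation and export for you.

```python
import time
import uuid
from contextlib import contextmanager

# Stand-in for whatever would ship spans to an APM backend.
FINISHED_SPANS = []


@contextmanager
def span(name, trace_id=None, parent_id=None):
    """Time a block of work and record it as a span dict when it ends."""
    record = {
        "trace_id": trace_id or uuid.uuid4().hex,
        "span_id": uuid.uuid4().hex[:16],
        "parent_id": parent_id,
        "name": name,
    }
    start = time.perf_counter()
    try:
        yield record
    finally:
        record["duration_ms"] = (time.perf_counter() - start) * 1000.0
        FINISHED_SPANS.append(record)


def handle_request():
    """Simulate one request made of two child operations."""
    with span("GET /checkout") as root:
        with span("db.query", trace_id=root["trace_id"], parent_id=root["span_id"]):
            time.sleep(0.01)  # pretend database call
        with span("render", trace_id=root["trace_id"], parent_id=root["span_id"]):
            time.sleep(0.005)  # pretend template rendering
    return root


if __name__ == "__main__":
    handle_request()
    for finished in FINISHED_SPANS:
        print(f'{finished["name"]}: {finished["duration_ms"]:.1f} ms')
```

Because every span carries the same trace ID, the pieces of one request can be stitched back together even when they run in different services.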

Slide 33

Slide 34

Slide 35

Slide 36

SHARED LANGUAGE
> Learning to speak Prod
> Teaching Prod to speak Dev (Structured Logs; Traces)

Slide 37

SHARED TOOLS
> Debugging in Prod
> Ops skills transferable and replicable
> New knowledge and methods shareable

Slide 38

SHARED SOURCE OF TRUTH
> Real production data
> Draw better lines from code to prod
> Write better, production-ready code

Slide 39

WHERE DO WE GO FROM HERE?

Slide 40

Testing & Experimentation

Slide 41

TEST IN PRODUCTION

Slide 42

Slide 43

DON’T DEBATE, EXPERIMENT

Slide 44

Speaking of Testing: QA
https://theelasticast.com/episodes/0017-qa/

Slide 45

Slide 46

DEVOPS
> Wave 1: Ops learns code & automation [1]
> Wave 2: Dev owns code through production [1]
[1] https://vimeo.com/341142053

Slide 47

DEV OWNS CODE THROUGH PRODUCTION
> Better, more production-ready code
> Real-world experimentation
> Improved operational resiliency

Slide 48

THE POWER IS YOURS

Slide 49

THANKS!
> Slides & References: noti.st/crayzeigh
> Trial: ela.st/aaron-aldrich-trial