tom.gillman
2018-10-01 17:29
How often do you update drivers in the sledgehammer image?

zehicle
2018-10-01 17:41
@tom.gillman which drivers are you thinking of? by design, sledgehammer is very stable and minimal.

tom.gillman
2018-10-01 17:41
i40e

zehicle
2018-10-01 17:47
are you having issues w/ sledgehammer not using the NIC?

tom.gillman
2018-10-01 17:56
Not exactly. I'm seeing inconsistent NIC naming. I have 2x 25G XXV710 cards, which initially identify as eth4-7. Most of them get renamed to enpXXs0f[0,1] as expected, except for the first one, which stays eth4. I'm trying to figure out if it's a f/w issue or a driver issue, and I notice that particular driver is at about 2.1.14-k, while Intel's current build is 2.4.10.

tom.gillman
2018-10-01 18:01
I can actually work around this particular one, since all I really care about is the lldp info I'm getting. It's just a minor inconvenience.

greg
2018-10-01 18:02
Is this all within sledgehammer or sledgehammer compared to an installed os?

tom.gillman
2018-10-01 18:07
within sledgehammer.

tom.gillman
2018-10-01 18:08
```
<sledgehammer> [root@esx05-r02 ~]# ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eno1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
    link/ether e4:43:4b:04:b5:e8 brd ff:ff:ff:ff:ff:ff
3: eno2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
    link/ether e4:43:4b:04:b5:e9 brd ff:ff:ff:ff:ff:ff
4: eno3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
    link/ether e4:43:4b:04:b5:ea brd ff:ff:ff:ff:ff:ff
5: eno4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
    link/ether e4:43:4b:04:b5:eb brd ff:ff:ff:ff:ff:ff
6: eth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 3c:fd:fe:c2:6a:d0 brd ff:ff:ff:ff:ff:ff
7: enp59s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 3c:fd:fe:c2:6a:d1 brd ff:ff:ff:ff:ff:ff
8: enp135s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 3c:fd:fe:c2:74:70 brd ff:ff:ff:ff:ff:ff
9: enp135s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 3c:fd:fe:c2:74:71 brd ff:ff:ff:ff:ff:ff
```
Look at line 6. I would expect that to be enp59s0f0

paulfantom
2018-10-01 18:18
Maybe a stupid question and not directly referring to rackN, but I figured you guys might be the best to answer that. My friend is looking for a "cloud" offering with access to iPXE boot and serial console, do you know any?

greg
2018-10-01 18:41
Yeah. I think that is an in-use issue. I'm not sure the kernel renames in-use devices.

greg
2018-10-01 18:42
That is for @tom.gillman

greg
2018-10-01 18:42
@paulfantom - depending upon what your friend needs you might want to look at http://Packet.net.

greg
2018-10-01 18:43
They definitely do that as a "cloud" offering.

paulfantom
2018-10-01 18:45
I'll let him know, thanks!

tom.gillman
2018-10-01 18:52
You're saying that because `eth4` is the primary interface it didn't get renamed?

tom.gillman
2018-10-01 18:57
good enough.

greg
2018-10-01 18:59
I think that is true.

florent.wagener
2018-10-01 19:22
@tom.gillman `drpcli gohai` contains the StableName for every NIC.

florent.wagener
2018-10-01 19:22
I ran into the same issue and @shane was kind enough to add this into gohai for me.

tom.gillman
2018-10-01 19:25
That's awesome. Thanks!

shane
2018-10-01 19:32
I think it was @vlowther that trued things up in the NIC naming - but happy that the change is useful :slightly_smiling_face:

vlowther
2018-10-01 20:03
@tom.gillman Yes, interfaces that are in use cannot be renamed, and depending on Sledgehammer version, NIC renaming may happen after we have brought up the interface we PXE booted from.

florent.wagener
2018-10-01 20:06
@shane, you're right it was @vlowther :slightly_smiling_face: Thanks again @vlowther

tom.gillman
2018-10-01 20:07
That seems to be the case here. Thanks for the info, @vlowther
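
For anyone who wants to spot this condition on their own machines, here is a minimal sketch, not part of DRP or Sledgehammer. It assumes `ip link show` output in the format pasted above and simply flags interfaces that still carry a kernel-style `ethN` name. The function name and the shortened sample text are illustrative only.

```python
import re

# Matches the "N: name:" header that starts each interface block in `ip link show` output.
IFACE_HEADER = re.compile(r"^\d+:\s+([^:@\s]+)[@:]", re.MULTILINE)
# Kernel-assigned names look like eth0 or eth4; renamed ones look like eno1 or enp59s0f1.
KERNEL_NAME = re.compile(r"^eth\d+$")


def unrenamed_interfaces(ip_link_output):
    """Return the interface names that still look like kernel defaults."""
    names = IFACE_HEADER.findall(ip_link_output)
    return [name for name in names if KERNEL_NAME.match(name)]


# Shortened copy of the output pasted earlier in this thread.
SAMPLE = """\
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eno1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
    link/ether e4:43:4b:04:b5:e8 brd ff:ff:ff:ff:ff:ff
6: eth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 3c:fd:fe:c2:6a:d0 brd ff:ff:ff:ff:ff:ff
7: enp59s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 3c:fd:fe:c2:6a:d1 brd ff:ff:ff:ff:ff:ff
"""

print(unrenamed_interfaces(SAMPLE))
```

A flagged name only says the rename did not happen; per the explanation above, the likely cause here is that the interface was already in use when renaming ran.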

florent.wagener
2018-10-02 15:15
Hey there, do you guys have a solution to install HP firmware updates (SPP) with DRP ?

greg
2018-10-02 15:16
Plans and thoughts, but nothing yet. It would work like dell support. Native tools pulling in.

2018-10-02 15:16
Time to feed the :bear:!

florent.wagener
2018-10-02 15:19
:disappointed:

jmatthew
2018-10-02 19:13
Question, got DHCP finally working, but hung on the tftp prompt. Environment: Linux Mint VM in VirtualBox.

zehicle
2018-10-02 19:15
@jmatthew that's not enough information. Can you provide some more details?

jmatthew
2018-10-02 19:19
@zehicle thanks, what else do you need, sorry, just back to trying to get my local rackn working :slightly_smiling_face:. i'll reopen this chat in vm too

jmatthew
2018-10-02 19:20
```
proto:dhcp4 iface:enp0s8 ifaddr:255.255.255.255:68 lport:67
op:0x02 htype:0x01 hlen:0x06 hops:0x00 xid:0x28745398 secs:0x0000 flags:0x8000
ci:0.0.0.0 yi:192.168.56.10 si:192.168.56.1 gi:0.0.0.0 ch:08:00:27:74:53:98
sname:"192.168.56.1" file:"lpxelinux.0"
option:code:053 val:"ack"
option:code:054 val:"192.168.56.1"
option:code:051 val:"60"
option:code:001 val:"255.255.255.0"
option:code:003 val:"192.168.56.1"
option:code:006 val:"192.168.56.1"
option:code:015 val:"http://insightinvestments.com"
option:code:028 val:"192.168.56.255"
option:code:058 val:"30"
option:code:059 val:"45"
```

jmatthew
2018-10-02 19:21
```
[
  {
    "ActiveEnd": "192.168.56.254",
    "ActiveLeaseTime": 60,
    "ActiveStart": "192.168.56.10",
    "Available": true,
    "Description": "",
    "Documentation": "",
    "Enabled": true,
    "Errors": [],
    "Meta": {},
    "Name": "enp0s8",
    "NextServer": "192.168.56.1",
    "OnlyReservations": false,
    "Options": [
      { "Code": 1, "Value": "255.255.255.0" },
      { "Code": 3, "Value": "192.168.56.1" },
      { "Code": 6, "Value": "192.168.56.1" },
      { "Code": 15, "Value": "http://insightinvestments.com" },
      { "Code": 28, "Value": "192.168.56.255" },
      { "Code": 67, "Value": "lpxelinux.0" }
    ],
    "Pickers": [ "hint", "nextFree", "mostExpired" ],
    "Proxy": false,
    "ReadOnly": false,
    "ReservedLeaseTime": 7200,
    "Strategy": "MAC",
    "Subnet": "192.168.56.1/24",
    "Unmanaged": false,
    "Validated": true
  }
]
```

jmatthew
2018-10-02 19:26
OMG, never mind, PEBKAC

jmatthew
2018-10-02 19:26
a default workflow helps :slightly_smiling_face:

greg
2018-10-02 19:26
cool - glad you found it

jmatthew
2018-10-02 19:28
now i can move forward with install :slightly_smiling_face:

bagricola
2018-10-03 11:36
having a bit of a brain fart - is there a var to get the current machine *name* in a template?

greg
2018-10-03 12:14
.Machine.Name

greg
2018-10-03 12:14
@bagricola

bagricola
2018-10-03 12:26
ahh

bagricola
2018-10-03 12:26
excellent, thanks

bagricola
2018-10-03 12:28
I just got Ansible AWX jobs triggering correctly post C7 install :slightly_smiling_face: minor annoyances due to how AWX handles provision callbacks but seems to work well

d.schrimpsher
2018-10-03 21:28
Related to Machine params.. Do you 1)create the Params directly in the machine endpoint, 2) create params using /params then attach to the machine or 3) either or

d.schrimpsher
2018-10-03 21:31
nevermind Randy answered it

shane
2018-10-03 21:31
Thanks, Randy!

greg
2018-10-03 23:01
:slightly_smiling_face: I can ramble on a lot more on the topic if Randy didn't cover it. :stuck_out_tongue:

r.levensalor
2018-10-04 01:14
Thanks guys. Based on our earlier conversations, we are just going to apply the params to the individual machines.

shane
2018-10-04 01:16
Randy, depending on how many you want to apply, you might want to create a profile, put the params in the profile, and then add the profile to the machines - but that's just a logical construct that helps keep things grouped together

r.levensalor
2018-10-04 01:19
Shane. Thanks. That is how we had started. But this is only a half a dozen servers that will be created with scripts.

zehicle
2018-10-04 01:24
@r.levensalor if you are creating the params (highly recommended) then make sure you set the default. If you do that, you can read the param on a machine (in the runner template expansion) even if it's not set on that machine

r.levensalor
2018-10-04 01:27
Thanks Rob. I'll add that to the backlog.

b.quan
2018-10-04 05:11
Is there a more detailed doc describing how to update a content package? After uploading a content package, I only see "Remove" button for the content package in UI, but did not see an edit button. One thing I wanted to do for the uploaded content package is to change a template in the content pack, but using vi to change the template (e.g., a seed template) is really painful.

greg
2018-10-04 05:28
There is a video in the RackN YouTube channel talking about the color demo and how to edit and control content packages.

greg
2018-10-04 05:28
There is not an edit in the UX at that point. The point of the content package is to be read only. @b.quan


zehicle
2018-10-04 12:24
@b.quan ^^

b.quan
2018-10-04 14:03
Thank you Greg and Rob for the clarification and pointer!

akbhat
2018-10-04 15:04
has joined #community201810

zehicle
2018-10-04 15:30
@akbhat $welcome

2018-10-04 15:30
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

d.schrimpsher
2018-10-04 19:17
Setting options for subnets. Can you give me an example of the "value" field?

d.schrimpsher
2018-10-04 19:17
what we have in our dhcp.conf is

d.schrimpsher
2018-10-04 19:17
option option-128 code 128 = string

d.schrimpsher
2018-10-04 19:17
so I think option-128 but might be string?


d.schrimpsher
2018-10-04 19:18
when in doubt read the directions

d.schrimpsher
2018-10-04 19:18
:slightly_smiling_face:

shane
2018-10-04 19:18
The "Description" fields in that JSON example are superfluous - they are ignored - and just there to help document the actual Option number a little better

d.schrimpsher
2018-10-04 19:19
Ok I get it now

shane
2018-10-04 19:21
more details can be found in the Architecture Reference pages too - and DHCP/Subnets have a little more info at: https://provision.readthedocs.io/en/latest/doc/arch/dhcp.html

shane
2018-10-04 19:23
@d.schrimpsher - does that help answer the question ?

d.schrimpsher
2018-10-04 19:24
yeap I have to look up the integer codes for the options. We used cheats (domain-name-server instead of 6 for instance)

shane
2018-10-04 19:25
yeah - that would be a nice enhancement to our Subnet model ... :slightly_smiling_face:
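
Until something like that exists, a small lookup table can stand in. This is a sketch, not a DRP feature: the table covers only a handful of ISC dhcpd option names with their RFC 2132 codes, and `to_subnet_options` is a hypothetical helper that emits the `Code`/`Value` shape seen in the subnet JSON earlier in this channel.

```python
# Partial map of ISC dhcpd option names to DHCP option codes (RFC 2132).
# Extend it with whatever other options your dhcpd.conf uses.
OPTION_CODES = {
    "subnet-mask": 1,
    "routers": 3,
    "domain-name-servers": 6,
    "domain-name": 15,
    "broadcast-address": 28,
    "tftp-server-name": 66,
    "bootfile-name": 67,
}


def to_subnet_options(named):
    """Turn {"routers": "192.168.56.1"} into [{"Code": 3, "Value": "192.168.56.1"}]."""
    options = []
    for name, value in named.items():
        if name not in OPTION_CODES:
            raise KeyError(f"no code known for option {name!r}; look it up and add it")
        options.append({"Code": OPTION_CODES[name], "Value": str(value)})
    # Sort by code so the result is stable regardless of input order.
    return sorted(options, key=lambda opt: opt["Code"])


print(to_subnet_options({
    "domain-name-servers": "192.168.56.1",
    "subnet-mask": "255.255.255.0",
}))
```

Site-specific options such as the `option-128` definition above have no standard name, so they would have to be added to the table by hand.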

d.schrimpsher
2018-10-04 19:31
no biggy

d.schrimpsher
2018-10-04 19:32
So is there any way to handle this sort of thing, or is that a workflow (or something else) problem

d.schrimpsher
2018-10-04 19:33
as far as EFI vs BIOS

shane
2018-10-04 19:34

shane
2018-10-04 19:35
when @greg or @vlowther get a chance to respond - they may have some thoughts on what you're trying to do with UEFI - we do have a lot of baked in support to try and magically handle UEFI boot modes easily - but as you likely know - UEFI is massively broken across almost every single vendor implementation ... :disappointed:

d.schrimpsher
2018-10-04 19:36
no!, really?

d.schrimpsher
2018-10-04 19:36
It looks nice on paper :slightly_smiling_face:

d.schrimpsher
2018-10-04 19:36
but pain -in -the -ass in real life

vlowther
2018-10-04 19:39
represses ranting instincts.

d.schrimpsher
2018-10-04 19:42
especially with that legacy bull*

greg
2018-10-04 19:48
@d.schrimpsher - I think that the tip code will attempt to ipxe (or send ipxe bootloaders that work for UEFI and legacy). If you want to do what you posted, you can do what @shane referenced.

d.schrimpsher
2018-10-04 20:55
I think I am going to leave it out for now and see how it works

shane
2018-10-04 21:54
feedback appreciated on success :slightly_smiling_face:

greg
2018-10-05 00:18
- Updating to latest drp-community-content will require a sledgehammer update.

greg
2018-10-05 00:18
By latest I mean tip.

greg
2018-10-05 00:19
This sledgehammer and updated content packages (both community and rackn) have been updated to support IPv6 environments.

zehicle
2018-10-05 00:20
which allows DRP to work in environments WITHOUT IPv4

zehicle
2018-10-05 00:21
note: that is NOT out of the box function. it's possible, but takes some expertise

florent.wagener
2018-10-05 00:22
@zehicle meaning with IPv6 ?

florent.wagener
2018-10-05 00:22
yeah

zehicle
2018-10-05 00:22
yes @florent.wagener we're working with someone who is exclusively IPv6

florent.wagener
2018-10-05 00:23
I just read greg's comment to the end :smile:

zehicle
2018-10-05 00:23
the magic is not just the IPv6 part, it's the without IPv4 part that's really tricky

b.quan
2018-10-05 15:46
Related to content package, can you include workflows into a bundle?

greg
2018-10-05 15:51
yes

greg
2018-10-05 15:51
just another directory, `workflows`, with yaml files. :slightly_smiling_face:

b.quan
2018-10-05 16:06
cool

b.quan
2018-10-05 16:19
@greg I assume json files are ok too for workflows?

shane
2018-10-05 16:26
@b.quan - yes you can use JSON, the `drpcli contents bundle ...` operation will convert it to YAML in the final artifact that it creates

b.quan
2018-10-05 16:26
Thanks @shane for confirming

d.schrimpsher
2018-10-05 16:50
So on subnet naming. What are the rules? I was trying a uuid and it yelled at me

d.schrimpsher
2018-10-05 16:50
Error on 852b4776-6148-4eca-8b79-3fed1041ef86: Invalid Name `852b4776-6148-4eca-8b79-3fed1041ef86`

b.quan
2018-10-05 16:53
@shane When you upload a content package, what happens to the existing objects (templates, workflows, etc.) with the same name defined in the bundle?

d.schrimpsher
2018-10-05 16:58
nevermind it didn't like it starting with a number

greg
2018-10-05 17:18
@b.quan - it won't load because of conflict.

greg
2018-10-05 17:19
You may want to look at the new and poorly documented bundlizer

greg
2018-10-05 17:19
it will take a set of existing read/write objects and build a content package.

greg
2018-10-05 17:19
you can optionally have it delete those read/write objects and reload them as the content package.

greg
2018-10-05 17:19
drpcli contents bundlizer

b.quan
2018-10-05 17:23
@greg Are you sure? From my experience, the content package shows up in UI after I upload it with some minor metadata changes like name, description with existing objects. It did not complain.

b.quan
2018-10-05 17:26
It would be nice that the upload process checks if the duplicate objects are exactly the same as existing ones, and only uploads new ones that did not exist. Only throws out an error if true conflicts exist (i.e., defined ones in the bundle are different from existing ones with the same name)

greg
2018-10-05 17:28
Yes if they are in content packs to begin with.

greg
2018-10-05 17:30
As long as you are replacing them from within the same content package.

b.quan
2018-10-05 17:30
gotcha
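
The upload behavior @b.quan asked for above (skip identical duplicates, add new objects, error only on true conflicts) can be sketched as follows. This illustrates the proposal only and is not how DRP handles uploads; `plan_upload` and the `(type, name)` keys are hypothetical.

```python
def plan_upload(existing, bundle):
    """Classify each bundle object against what is already on the endpoint.

    Both arguments map a (type, name) key to the object's content.
    """
    plan = {"add": [], "skip": [], "conflict": []}
    for key, obj in bundle.items():
        if key not in existing:
            plan["add"].append(key)        # new object: safe to upload
        elif existing[key] == obj:
            plan["skip"].append(key)       # identical duplicate: nothing to do
        else:
            plan["conflict"].append(key)   # same name, different content
    return plan


existing = {("templates", "seed.tmpl"): "v1", ("workflows", "install"): "a"}
bundle = {
    ("templates", "seed.tmpl"): "v1",      # identical
    ("workflows", "install"): "b",         # differs
    ("workflows", "discover"): "c",        # new
}
print(plan_upload(existing, bundle))
```

Under this scheme an upload would be rejected only when the `conflict` list is non-empty.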

mat.marini
2018-10-05 19:48
has joined #community201810

zehicle
2018-10-05 23:08
@mat.marini $welcome

2018-10-05 23:08
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

zehicle
2018-10-06 00:40
KRIB NOTICE: if you are doing a fresh build, set the version to 1.12.1

zehicle
2018-10-06 00:40
I have a pull request out to update the version since the release

zehicle
2018-10-06 00:41
the version skew with kubeadm defaults breaks the install unless you update to latest

zehicle
2018-10-06 00:41
congratulates Kubernetes on a new release

akbhat
2018-10-06 13:26
noob question: Does DRP provide something like Apache Guacamole remote access? Console access to host & VMs?

zehicle
2018-10-06 13:33
Sorry, no. Would be an interesting integration, likely not a hard one

zehicle
2018-10-06 13:35
RackN does have IPMI integration that enables remote access

zehicle
2018-10-06 13:37
Typically, DRP operators use SSH or the BMC terminal, so no other remote desktop is needed.

akbhat
2018-10-06 14:25
IPMI is not available in the free tier, correct?

greg
2018-10-06 14:26
It is RackN licensed component. You can get a trial license to play with it.

greg
2018-10-06 14:26
The IPMI plugin provides both configuration of the BMC and common actions (though serial console is not one of those currently).

greg
2018-10-06 14:27
@akbhat - Hope that helps.

ctrees
2018-10-08 20:23
So... in UX CI/CD quest... the security guys are into using docker artifacts... I see: https://hub.docker.com/r/digitalrebar/provision/~/dockerfile/

ctrees
2018-10-08 20:25
is anybody actively using the dockerhub artifact for other testing on travis ? (If not I'm going to attempt one)

ctrees
2018-10-08 20:31
I am planning on using the CI/CD travis flow for cli pen testing, but if someone is pulling the provision artifact for some other community purpose, I'll fork that and add some cli pen testing.

ctrees
2018-10-08 20:57
humm... I don't think I'm following the docker artifact as I see: 2018-02-14 23:55:48 in the step which seems out-of-date... I have not done the dockerhub thing for a long time... I'll ask in community meetup tomorrow

zehicle
2018-10-08 23:23
Yes, a lot of people use that

ctrees
2018-10-08 23:25
I got confused reading the dockerhub logs... and where the artifact was coming from

ctrees
2018-10-08 23:28
As I disassembled what the security folks are playing with, it's basically a cli runner with 'security DSL' on top of cucumber... so I figured... heck build a package for drp cli that fits with the tool chain.

ctrees
2018-10-08 23:31
they basically put the target into a docker container, and put tools in another... I also got to thinking that drp-cli test has LOTS of test code buried in go-yaml ? (at least it looks like it)...

ctrees
2018-10-08 23:51
so thinking I can put dr-provision in UUT container, dr-cli in the 'test-tools' container, pull the cli tests from *_test.go into a *_test.feature (that's eventually saved in the docs directory) and run it almost like what they are using (CI/CD wise) for pen testing.

ctrees
2018-10-08 23:57
... just all the layers and meta is twisting my brain right now... may be eating my own tail... better go watch football...

zehicle
2018-10-09 14:09
online meetup today talking about some of our latest extensions including v6. 1 CT / 11 PT https://www.meetup.com/digitalrebar/events/lchdhpyxnbmb/

digital.rebar.slack
2018-10-09 14:55
has joined #community201810

digital.rebar.slack
2018-10-09 15:07
Hi, are you any closer to building sledgehammer to run on Arm64 hardware? I've been away for a while and am restarting this poc.

shane
2018-10-09 15:16
@digital.rebar.slack - $welcome ...

2018-10-09 15:16
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

shane
2018-10-09 15:16
and ... yes, we've done some work in that area - not sure if the arm64 build is quite ready yet - but check in with the Meetup today (posting above) for details

digital.rebar.slack
2018-10-09 15:23
thanks @shane. I have a need to get this working on a 20xRock64 cluster or two! I am setting up DR on one of the nodes and having the rest install from that.

digital.rebar.slack
2018-10-09 15:25
...I will start off trying to get it to PXE boot the 19x clients and then possibly use Ansible to install Kubernetes. I already use Krib on x86 machines and that works a dream!!!!

shane
2018-10-09 15:25
If there are any patches needed for KRIB / ARM build - we'd happily accept Pull Requests .... :slightly_smiling_face:

digital.rebar.slack
2018-10-09 15:27
:wink: of course! Sledgehammer is the first stumbling block. Any pointers as to what will need doing there to get that working on Arm?

greg
2018-10-09 15:27
I currently need an environment to build busy box.

greg
2018-10-09 15:29
and some priority. In order to get the ipv6 support working, I broke the arm64 sledgehammer. I need to get an arm64 busybox built and then do some building gyrations and then we need to test. There are pull requests out there, but it still needs some work. Like an arm64 jq and some tweaks to content packs to get that jq instead of the amd64 version. The basics are there, but the polishing needs to continue.

greg
2018-10-09 15:31
@vlowther has made a lot of progress on it.

digital.rebar.slack
2018-10-09 15:34
...there is/was an arm64 sledgehammer? That is, it runs on an arm machine?

greg
2018-10-09 15:42
Yes - in dev under a set of PRs on DR Provision and community-content.

vlowther
2018-10-09 15:43

vlowther
2018-10-09 15:44
at least, fully functional in that things operate in an arm64 QEMU session.

vlowther
2018-10-09 15:45
What I do not have is a jq that would wind up on an installed system if it was installed using the traditional OS installer

vlowther
2018-10-09 15:45
but I would recommend using image-based installers anyways.

digital.rebar.slack
2018-10-09 15:47
...thanks, I'll take a look.

ben.le
2018-10-09 17:06
I just added the subnet, and "drpcli info status" shows that DHCP and TFTP are enabled. However, my new machine can't pick up an IP address from the DHCP service. Any ideas?

ben.le
2018-10-09 17:07
a client machine on the same subnet

shane
2018-10-09 17:15
is the Subnet "enabled"? (see the Portal UX, or CLI: `drpcli subnets show SubnetName | jq '.Enabled'`) Also, are your client machines bare metal or VMs on a host/hypervisor ? you can also turn up the DHCP logging to make sure that the DHCP packets are hitting the DRP endpoint

ben.le
2018-10-09 17:17
the subnet is enabled "true"

ben.le
2018-10-09 17:18
my machine is bare metal

shane
2018-10-09 17:28
have you tried the DHCP logging option to see if you're getting log entries for that traffic ?

ben.le
2018-10-09 17:44
what's the logging level to enable for DHCP?

ben.le
2018-10-09 17:51
I set the DHCP log level to "debug", but I don't see much DHCP information in the logs

ben.le
2018-10-09 17:52
# drpcli prefs set '{"debugDhcp":"debug"}'
{
  "baseTokenSecret": "FiNEDnUzgyBZ5VEp2YNe53SJ9ZfNHO9F",
  "debugBootEnv": "warn",
  "debugDhcp": "debug",
  "debugFrontend": "warn",
  "debugPlugins": "warn",
  "debugRenderer": "warn",
  "defaultBootEnv": "local",

shane
2018-10-09 17:52
yep - DHCP should do it - if DHCP packets are climbing up the stack to DRP

ben.le
2018-10-09 17:52
Oct 09 09:46:10 smf-mvx-drp-stage-2 systemd[1]: dr-provision.service failed.
Oct 09 10:01:00 smf-mvx-drp-stage-2 systemd[1]: Started DigitalRebar Provision Integrated DHCP and File Provisioner.
Oct 09 10:01:00 smf-mvx-drp-stage-2 systemd[1]: Starting DigitalRebar Provision Integrated DHCP and File Provisioner...
Oct 09 10:01:00 smf-mvx-drp-stage-2 dr-provision[55287]: dr-provision2018/10/09 17:01:00.891942 Version: v3.11.0-0-8da90ceaf50f97c0d645c081f0384
Oct 09 10:01:00 smf-mvx-drp-stage-2 dr-provision[55287]: dr-provision2018/10/09 17:01:00.892474 Extracting Default Assets
Oct 09 10:01:00 smf-mvx-drp-stage-2 dr-provision[55287]: dr-provision2018/10/09 17:01:00.892491 Extracting Default Assets
Oct 09 10:01:02 smf-mvx-drp-stage-2 dr-provision[55287]: dr-provision2018/10/09 17:01:02.549416 Starting metrics server
Oct 09 10:01:03 smf-mvx-drp-stage-2 dr-provision[55287]: dr-provision2018/10/09 17:01:03.055535 Starting TFTP server
Oct 09 10:01:03 smf-mvx-drp-stage-2 dr-provision[55287]: dr-provision2018/10/09 17:01:03.057402 Starting static file server
Oct 09 10:01:03 smf-mvx-drp-stage-2 dr-provision[55287]: dr-provision2018/10/09 17:01:03.057928 Starting DHCP server
Oct 09 10:01:03 smf-mvx-drp-stage-2 dr-provision[55287]: dr-provision2018/10/09 17:01:03.059289 Starting PXE/BINL server
Oct 09 10:01:03 smf-mvx-drp-stage-2 dr-provision[55287]: dr-provision2018/10/09 17:01:03.061634 Starting API server
Oct 09 10:01:04 smf-mvx-drp-stage-2 dr-provision[55287]: dr-provision2018/10/09 17:01:04.074410 [4:1]:frontend [audit]: /home/travis/gopath/src/
Oct 09 10:01:04 smf-mvx-drp-stage-2 dr-provision[55287]: [4:1]Authenticated user rocketskates from 127.0.0.1
Oct 09 10:01:04 smf-mvx-drp-stage-2 dr-provision[55287]: dr-provision2018/10/09 17:01:04.122773 [11:2]dhcp:dhcp [ warn]: /home/travis/gopath/src
Oct 09 10:01:04 smf-mvx-drp-stage-2 dr-provision[55287]: [11:2]No matching subnet, will respond to 0.0.0.0 from 127.0.0.1
Oct 09 10:01:04 smf-mvx-drp-stage-2 dr-provision[55287]: dr-provision2018/10/09 17:01:04.125088 [12:3]dhcp:dhcp [ warn]: /home/travis/gopath/src
Oct 09 10:01:04 smf-mvx-drp-stage-2 dr-provision[55287]: [12:3]No matching subnet, will respond to 0.0.0.0 from 127.0.0.1
Oct 09 10:06:54 smf-mvx-drp-stage-2 dr-provision[55287]: dr-provision2018/10/09 17:06:54.293298 [17:4]dhcp:dhcp [ warn]: /home/travis/gopath/src
Oct 09 10:06:54 smf-mvx-drp-stage-2 dr-provision[55287]: [17:4]No matching subnet, will respond to 0.0.0.0 from 127.0.0.1
Oct 09 10:06:54 smf-mvx-drp-stage-2 dr-provision[55287]: dr-provision2018/10/09 17:06:54.297818 [18:5]dhcp:dhcp [ warn]: /home/travis/gopath/src
Oct 09 10:06:54 smf-mvx-drp-stage-2 dr-provision[55287]: [18:5]No matching subnet, will respond to 0.0.0.0 from 127.0.0.1

shane
2018-10-09 17:52
I wouldn't suggest anything higher - as it may cause DRP to stop responding

ben.le
2018-10-09 17:53
got it

shane
2018-10-09 17:53
alternatively - you can run `tcpdump` (or similar) on the DRP endpoint and sniff for port 67/69 packets to make sure they are hitting your host

shane
2018-10-09 17:53
is it possible there is another DHCP server on that L2 segment ?

ben.le
2018-10-09 17:54
no, just only one DHCP enabled on DRP

ben.le
2018-10-09 17:55
# telnet localhost 67
Trying ::1...
telnet: connect to address ::1: Connection refused
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Connection refused
You have new mail in /var/spool/mail/root

ben.le
2018-10-09 17:55
it seems like port 67 is not enabled on the drp server

shane
2018-10-09 17:56
67 is a UDP service, telnet is TCP - that should fail
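
The TCP-versus-UDP point can be reproduced without touching DHCP at all. The sketch below is an illustration under stated assumptions: it binds a UDP-only socket on an unprivileged loopback port as a stand-in for a service like DHCP, shows that a TCP connect to that port fails (which is what telnet reported), and shows that a datagram still gets through. The function names are made up for this example.

```python
import socket


def tcp_connect_fails(host, port, timeout=1.0):
    """Return True when a TCP connect to host:port does not succeed."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) != 0


def udp_round_trip(server, payload=b"ping", timeout=1.0):
    """Send one datagram to an already-bound UDP socket and return what it received."""
    server.settimeout(timeout)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.sendto(payload, server.getsockname())
    data, _addr = server.recvfrom(1024)
    return data


def demo():
    # Stand-in for a UDP-only service: bind UDP on a free loopback port.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server:
        server.bind(("127.0.0.1", 0))
        port = server.getsockname()[1]
        return tcp_connect_fails("127.0.0.1", port), udp_round_trip(server)


print(demo())
```

So a refused telnet connection on port 67 says nothing about whether the DHCP service is listening.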

shane
2018-10-09 17:56
you can see if DRP has the services enabled via the `drpcli info status` command - it can take a short bit to run, as it verifies the services are actually running with a basic "health check"

ben.le
2018-10-09 17:57
# drpcli info status
{
  "API": { "Alive": true, "Enabled": true, "Port": 8092 },
  "BINL": { "Alive": true, "Enabled": true, "Port": 4011 },
  "DHCP": { "Alive": true, "Enabled": true, "Port": 67 },
  "Static": { "Alive": true, "Enabled": true, "Port": 8091 },
  "TFTP": { "Alive": true, "Enabled": true, "Port": 69 }
}

ben.le
2018-10-09 17:57
both DHCP and TFTP are enabled
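
If you want to check that output in a script rather than by eye, something like the following could work. It is a sketch that assumes the JSON shape shown in the `drpcli info status` output above (service name mapped to `Alive`, `Enabled`, `Port`); `unhealthy_services` is a hypothetical helper, not a drpcli feature.

```python
import json


def unhealthy_services(status_json):
    """Return the names of services that are not both Enabled and Alive."""
    status = json.loads(status_json)
    return sorted(
        name
        for name, svc in status.items()
        if not (svc.get("Enabled") and svc.get("Alive"))
    )


# Status output as pasted in this thread.
STATUS = """
{
  "API": { "Alive": true, "Enabled": true, "Port": 8092 },
  "BINL": { "Alive": true, "Enabled": true, "Port": 4011 },
  "DHCP": { "Alive": true, "Enabled": true, "Port": 67 },
  "Static": { "Alive": true, "Enabled": true, "Port": 8091 },
  "TFTP": { "Alive": true, "Enabled": true, "Port": 69 }
}
"""

print(unhealthy_services(STATUS))
```

An empty result only confirms the services are running on the endpoint; it does not show that client DHCP packets are reaching it, which is what the `tcpdump` suggestion above is for.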

shane
2018-10-09 17:58
can you show the `dr-provision` service startup options ? specifically do you have `--static-ip` set ?

ben.le
2018-10-09 17:59
# ps -ef |grep dr-pro
root 55287 1 0 10:01 ? 00:00:03 /usr/local/bin/dr-provision

shane
2018-10-09 18:00
what network / subnet is your DRP on, and what is your Subnet config ? (`ip address show`, and `drpcli subnets show SubnetName`)

ben.le
2018-10-09 18:00
eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:50:56:9a:71:f8 brd ff:ff:ff:ff:ff:ff
    inet 10.29.123.42/24 brd 10.29.123.255 scope global noprefixroute eth0

ben.le
2018-10-09 18:01
# drpcli subnets show local_subnet
{
  "ActiveEnd": "10.29.123.70",
  "ActiveLeaseTime": 60,
  "ActiveStart": "10.29.123.50",
  "Available": true,
  "Description": "",
  "Documentation": "",
  "Enabled": true,
  "Errors": [],
  "Meta": {},
  "Name": "local_subnet",
  "NextServer": "",
  "OnlyReservations": false,
  "Options": [
    { "Code": 3, "Value": "10.29.123.254" },
    { "Code": 6, "Value": "10.29.200.30" },
    { "Code": 15, "Value": "http://cso.fireeye.com" },
    { "Code": 1, "Value": "255.255.255.0" },
    { "Code": 28, "Value": "10.29.123.255" }
  ],
  "Pickers": [ "hint", "nextFree", "mostExpired" ],
  "Proxy": false,
  "ReadOnly": false,
  "ReservedLeaseTime": 7200,
  "Strategy": "MAC",
  "Subnet": "10.29.123.42/24",
  "Unmanaged": false,
  "Validated": true
}

shane
2018-10-09 18:06
@digital.rebar.slack and - the Meetup is running right now - if you wanted to talk ARM Sledgehammer - now is a good time :slightly_smiling_face: https://zoom.us/j/3403934274

ben.le
2018-10-09 18:14
@shane i just ran dr-provision with --static-ip
# /usr/local/bin/dr-provision --static-ip 10.29.123.42

shane
2018-10-09 18:15
@ben.le theoretically, you shouldn't need that - but if you have a number of Interfaces on the DRP endpoint, it's possible that we're not making the right selection for the packet responses

shane
2018-10-09 18:16
using `tcpdump` would be good on the DRP endpoint to see what the packets are doing

shane
2018-10-09 18:16
also - the `Subnet:` setting may need to be tweaked to `"Subnet": "10.29.123.0/24"` - but not sure if the setting you have would actually cause any issue
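
To see what that tweak amounts to, Python's standard `ipaddress` module can derive the canonical network from the configured value. This is a sketch for checking a config by hand; `check_subnet` is a hypothetical helper, and whether DRP itself cares about host bits in `Subnet` is, as noted above, not established here.

```python
import ipaddress


def check_subnet(subnet, active_start, active_end):
    """Summarize a Subnet value and its active range."""
    iface = ipaddress.ip_interface(subnet)
    network = iface.network
    return {
        # The network address form, e.g. 10.29.123.0/24.
        "canonical": str(network),
        # True when the configured value is a host address rather than the network address.
        "has_host_bits": iface.ip != network.network_address,
        # True when both ends of the active range fall inside the network.
        "range_in_subnet": all(
            ipaddress.ip_address(addr) in network
            for addr in (active_start, active_end)
        ),
    }


# Values taken from the subnet config pasted above.
print(check_subnet("10.29.123.42/24", "10.29.123.50", "10.29.123.70"))
```

For the config above this reports the canonical form `10.29.123.0/24`, flags the host bits, and confirms the active range sits inside the network.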

b.quan
2018-10-09 21:00
@greg @shane For the command: drpcli contents bundle [file] [meta fields] [flags], do you have an example that includes "[meta fields]"?

zehicle
2018-10-09 21:02
@b.quan in the content, there are ._[Field].meta files that contain meta fields. You can provide them from the CLI also using Name=Foo Description=Bar Version=123.23

b.quan
2018-10-09 21:03
cool, thanks @zehicle!

zehicle
2018-10-09 22:06
The meetup video did not record the demos, sorry all. We did take good notes: https://docs.google.com/document/d/1UxoDkkhzlTHztcLEmmwUweZG5DJQvbyIHYcaIZsbcFE/edit

cat
2018-10-10 13:41
has joined #community201810

cat
2018-10-10 13:47
Hey guys, this is my 'alter ego' for the open work I am doing with Bast23 domain that will include DevSecOps CI/CD work.

greg
2018-10-10 13:51
nice

ctrees
2018-10-10 13:53
This is my 'alter ego' for mailservices work (USPS). Chris Trees, same person as catBast23; this is my 'public declaration notification' record to the community. Thanks!

shane
2018-10-10 13:55
This is the official $welcome notice for the alter ego @cat

2018-10-10 13:55
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

cat
2018-10-10 13:58
This is confirmation of ctrees' public statement. Yea, this sort of thing has kept me out of a lot of worthless legal meetings....

cat
2018-10-10 14:05
I'll post up the DevSecOps stuff to its own github and post here.

cat
2018-10-10 14:12
so... a setup question... On a macmini I did the drp install... then did the vbox 'magic' that greg showed a few days ago... let drp UI find the vbox
ccatmini:drp cat$ sudo ./dr-provision --base-root=`pwd`/drp-data --local-content="" --default-content=""
Password:
dr-provision2018/10/10 14:11:14.829414 Version: v3.11.0-tip-16-7fd6f424065191df261191bc72c85bc189b70100
dr-provision2018/10/10 14:11:14.829579 Extracting Default Assets
dr-provision2018/10/10 14:11:14.829585 Extracting Default Assets
dr-provision2018/10/10 14:11:15.637542 Starting metrics server
dr-provision2018/10/10 14:11:15.761149 Starting TFTP server
dr-provision2018/10/10 14:11:15.761429 Starting static file server
dr-provision2018/10/10 14:11:15.761629 Starting DHCP server
dr-provision2018/10/10 14:11:15.762308 Starting PXE/BINL server
dr-provision2018/10/10 14:11:15.762661 Starting API server
dr-provision2018/10/10 14:11:16.046866 [3:1]virtualbox-ipmi:virtualbox-ipmi [error]: /home/travis/gopath/src/github.com/digitalrebar/provision/midlayer/messaging_client.go:71
[3:1]Stop failed: Post http://unix/api-plugin/v3/stop: EOF
dr-provision2018/10/10 14:11:16.132084 [16:2]virtualbox-ipmi:virtualbox-ipmi [error]: /home/travis/gopath/src/github.com/digitalrebar/provision/midlayer/messaging_client.go:71
[16:2]Stop failed: Post http://unix/api-plugin/v3/stop: EOF

cat
2018-10-10 14:13
Should I be specifying the IP for DRP ?

cat
2018-10-10 14:15
(I am attempting to set up Greg's dev env with vbox)

zehicle
2018-10-10 14:38
BARE METAL CLOUD > we just added a new RackN licensed pooling plugin that extends the actions API so that you can request/release machines from a single API/CLI call. That includes being able to set params for images and start workflows. If you've seen our terraform plugin, this basically moves that functionality (and some extras!) into the API so that it can be used by any client without coding.

akbhat
2018-10-10 14:39
Are there any examples illustrating the use of the Terraform plugin?

shane
2018-10-10 14:41
@akbhat - yes, see the Youtube channel - there are a couple of videos with examples



zehicle
2018-10-10 14:44
@akbhat as a bonus... we have examples of installing Kubernetes via Terraform using that plugin in the KRIB videos list

zehicle
2018-10-10 14:45
which takes under 8 minutes INCLUDING the Packet provisioning time

akbhat
2018-10-10 14:45
thank you gentlemen

zehicle
2018-10-10 14:46
@akbhat please note that we merged the Contrail stage into KRIB tip last night.

cat
2018-10-10 14:47
SO... what did I miss ? (iPXE a machine from vbox)... [974:5]Incoming iPXE does not support bzImage dr-provision2018/10/10 14:42:57.903812 [977:6]dhcp:dhcp [ warn]: /home/travis/gopath/src/github.com/digitalrebar/provision/midlayer/pxe.go:70

cat
2018-10-10 14:57
humm... is this the 'dhcp option 67' thing ?

greg
2018-10-10 15:22
it is a warning that your ipxe is not powerful enough, and it will be sent a new one.

greg
2018-10-10 15:22
it is not fatal.

cat
2018-10-10 15:23
OH... 'I think' the vbox ipxe is the issue (I was attempting a 'BareMetal' vm on vbox)... I think I just need to figure out the workflow that starts up a vbox vm using the VirtualBox IPMI

greg
2018-10-10 15:28
On virtualbox, I've found that I have to power off and power on the vm to work correctly with the ipxe chaining after the first time.

cat
2018-10-10 15:29
Ok... just the vm or restart vbox ?

greg
2018-10-10 15:36
just the vm

cat
2018-10-10 15:46
I have dr-provision running on the mac without --static-ip #

cat
2018-10-10 15:48
I noticed Ben maybe fixed his issue by using --static-ip #... would that matter ? It looks as if drp is answering the dhcp request in the logs... but ??

greg
2018-10-10 15:49
You need to make sure that the mac has a route for broadcasting.

greg
2018-10-10 15:50
something like: `sudo route add 255.255.255.255 $IPADDR`

greg
2018-10-10 15:50
where $IPADDR is the hostonly network ip address on the mac for that network.

cat
2018-10-10 15:50
I did... but checking again...

greg
2018-10-10 15:50
hmm okay

greg
2018-10-10 15:51
you can turn up dhcp debug logging to see if DRP is getting the requests.

greg
2018-10-10 15:52
Also - make sure that you start DRP after the vboxnet0 is up and present.

greg
2018-10-10 15:52
DRP should try and handle it, but it may not.

cat
2018-10-10 15:53
crap... you were right... my stupid it WAS route add...

cat
2018-10-10 15:54
I did it but in my 'session last night'...

cat
2018-10-10 15:56
thanks....

greg
2018-10-10 16:00
cool

cat
2018-10-10 16:24
sledgehammer seems to get stuck booting... the vm console shows %3... the lease appears, but no machine...

cat
2018-10-10 16:26
I see a 'virtualbox-discover' in the stages... should I be using that ?

cat
2018-10-10 16:27
catmini:drp cat$ ./drpcli prefs list { "baseTokenSecret": "Ekwrct_OrX-FJxPai0vVbqpkb2OvX7Uz", "debugBootEnv": "warn", "debugDhcp": "warn", "debugFrontend": "warn", "debugPlugins": "warn", "debugRenderer": "warn", "defaultBootEnv": "sledgehammer", "defaultStage": "discover", "defaultWorkflow": "discovery", "knownTokenTimeout": "3600", "logLevel": "warn", "systemGrantorSecret": "NHQ32w3MDy1N4Tuu2Hi7GGBbLxmVA4RV", "unknownBootEnv": "discovery", "unknownTokenTimeout": "600" }

cat
2018-10-10 16:27
catmini:drp cat$ ./drpcli workflows list [ { "Available": true, "Description": "", "Documentation": "", "Errors": [], "Meta": {}, "Name": "centos7", "ReadOnly": false, "Stages": [ "centos-7-install", "complete" ], "Validated": true }, { "Available": true, "Description": "", "Documentation": "", "Errors": [], "Meta": {}, "Name": "discovery", "ReadOnly": false, "Stages": [ "discover", "sledgehammer-wait" ], "Validated": true } ]

cat
2018-10-10 16:28
I do have 'sledgehammer-wait' in the 'discovery' (aka following the quick-start)

cat
2018-10-10 16:51
I am using tip...

cat
2018-10-10 16:55
Version tip v3.11.0-tip-16-7fd6f424065191df261191bc72c85bc189b70100
Feature Flags api-v3, sane-exit-codes, common-blob-size, change-stage-map, job-exit-states, package-repository-handling, profileless-machine, threaded-log-levels, plugin-v2, fsm-runner, plugin-v2-safe-config, workflows, default-workflow, http-range-header, roles, tenants, secure-params, separate-meta-api, slim-objects, secure-param-upgrade, sprig
Endpoint MAC Address and API Port 3c:07:54:72:49:e2, 8092
OS and Architecture darwin amd64

cat
2018-10-10 17:25
... so THINKING maybe it was the image... I hacked and replaced with the centos pxe:
catmini:drp-data cat$ sudo cp ~/test/vmlinuz-centos751804 tftpboot/sledgehammer/a6ea7c254acdbf7050fcd1b2d756471ceb4b2b1a/vmlinuz0
catmini:drp-data cat$ sudo ls -alu tftpboot/sledgehammer/a6ea7c254acdbf7050fcd1b2d756471ceb4b2b1a
total 417968
drwx------+ 8 root staff 256 Oct 10 12:21 .
drwx------+ 3 root staff 96 Oct 10 12:18 ..
-rw-------+ 1 root staff 0 Oct 10 11:40 .sledgehammer_a6ea7c254acdbf7050fcd1b2d756471ceb4b2b1a.rebar_canary
-rw-r--r--+ 1 root wheel 183549952 Oct 9 20:40 root.squashfs
-rw-r--r--+ 1 root wheel 160 Oct 10 11:40 sha1sums
-rw-r--r--+ 1 root wheel 17814036 Oct 9 20:40 stage1.img
-rwxr-xr-x+ 1 root staff 6224704 Oct 10 12:21 vmlinuz0
-rwxr-xr-x+ 1 root wheel 6398144 Oct 10 12:19 vmlinuz0-org
catmini:drp-data cat$

cat
2018-10-10 17:26
the only difference was it stalled on 4% instead of 3%

greg
2018-10-10 17:32
those are normal fall-throughs for unknown machines. Not sure why the system is hanging at 3-4%.

ctrees
2018-10-10 17:38
my 'guess' is vm vbox networking... what version of vbox are you on... Mine: VirtualBox Graphical User Interface Version 5.2.4 r119785 (Qt5.6.3) Copyright 2017 Oracle Corporation and/or its affiliat

cat
2018-10-10 17:42
woops my alter ego got in the way...

cat
2018-10-10 17:42
I saw this:

cat
2018-10-10 17:42
modifyArgs := []string{"modifyvm", m.Name, "--ioapic", "on", "--cpus", fmt.Sprintf("%d", cpuCount), "--boot1", "net", "--boot2", "disk", "--boot3", "none", "--boot4", "none", "--memory", fmt.Sprintf("%d", memSizeInMB), "--vram", fmt.Sprintf("%d", vramSizeInMB), "--nic1", "hostonly", "--hostonlyadapter1", "vboxnet0", "--nictype1", "82545EM", "--nic2", "nat", "--nictype2", "82545EM"}

greg
2018-10-10 17:42
oh - it must be named vboxnet0. sorry

cat
2018-10-10 17:44
yea I got the vboxnet0... but maybe nic and pro mode ?

cat
2018-10-10 17:46
catmini:drp cat$ ./drpcli subnets list [ { "ActiveEnd": "192.168.56.254", "ActiveLeaseTime": 60, "ActiveStart": "192.168.56.10", "Available": true, "Description": "", "Documentation": "", "Enabled": true, "Errors": [], "Meta": {}, "Name": "vboxnet0", "NextServer": "", "OnlyReservations": false, "Options": [ { "Code": 3, "Value": "192.168.56.1" }, { "Code": 6, "Value": "192.168.56.1" }, { "Code": 15, "Value": "cat9.private" }, { "Code": 1, "Value": "255.255.255.0" }, { "Code": 28, "Value": "192.168.56.255" } ], "Pickers": [ "hint", "nextFree", "mostExpired" ], "Proxy": false, "ReadOnly": false, "ReservedLeaseTime": 7200, "Strategy": "MAC", "Subnet": "192.168.56.1/24", "Unmanaged": false, "Validated": true } ]

cat
2018-10-10 17:48
Was the NIC type...

cat
2018-10-10 17:49
I sort of remember that from a previous project years ago... (the default vb nic didn't work)
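
The NIC-type fix above could be scripted roughly as follows. This is only a sketch: cat does not say which NIC type ended up working, so the default below is simply the `82545EM` value from the Go snippet quoted earlier, and the VM name `drp-test` is a hypothetical placeholder.

```python
def modifyvm_nic_args(vm_name, adapter="vboxnet0", nic_type="82545EM"):
    """Build a VBoxManage command line that puts NIC 1 on a host-only
    network and selects the emulated NIC type.

    The flag names mirror the modifyArgs list quoted in the chat; the
    default nic_type is an assumption taken from that list.
    """
    return [
        "VBoxManage", "modifyvm", vm_name,
        "--nic1", "hostonly",
        "--hostonlyadapter1", adapter,
        "--nictype1", nic_type,
    ]


if __name__ == "__main__":
    # "drp-test" is a made-up VM name; power the VM off before modifying it.
    print(" ".join(modifyvm_nic_args("drp-test")))
```

The function only builds the argument list, so it can be inspected before being handed to `subprocess.run`.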

cat
2018-10-10 17:54
YEA... got a machine :wink:....

cat
2018-10-10 17:54
catmini:drp cat$ ./drpcli machines list [ { "Address": "192.168.56.10", "Available": true, "BootEnv": "sledgehammer", "CurrentJob": "08a7ce6e-0da1-46f4-8f9d-c2651276da2c", "CurrentTask": 5, "Description": "", "Errors": [], "HardwareAddrs": [ "08:00:27:5e:b7:e8"

greg
2018-10-10 18:03
:slightly_smiling_face:

tom.gillman
2018-10-10 18:04
Just to verify. debug-fs is mounted in sledgehammer? I see it when i do `systemctl`
```
. . .
dev-mqueue.mount               loaded active mounted POSIX Message Queue
proc-sys-fs-binfmt_misc.mount  loaded active mounted Arbitrary Execut
run-user-0.mount               loaded active mounted /run/user/0
sys-kernel-config.mount        loaded active mounted Configuration File S
sys-kernel-debug.mount         loaded active mounted Debug File System
var-lib-nfs-rpc_pipefs.mount   loaded active mounted RPC Pipe File Sys
. . .
```
but I want to make sure it wasn't me screwing around.

tom.gillman
2018-10-10 18:07
nm. Found it in systemd config.

greg
2018-10-10 18:08
hmm - my sledgehammer doesn't have debug

greg
2018-10-10 18:08
nm - yes it does. Silly me.

greg
2018-10-10 18:08
bad grep

zehicle
2018-10-10 20:35
wants to encourage people in channel to use the code/snippet feature for long posts

shane
2018-10-10 20:39
please ... please ... please ^^^^^^^

cat
2018-10-10 20:51
sorry... says the current offender...

ben.le
2018-10-11 17:15
@shane regarding enabling DHCP on DR Provision: I already corrected the subnet and verified that ports 67 and 69 are working properly, but the client can't pick up an IP address from the DRP server or get any further. So I created a regular Linux DHCP server to test, and the client (bare metal without OS) can receive an IP address without any problems. Any ideas how to resolve the issue? How do I monitor the DHCP log on DRP?

ben.le
2018-10-11 17:19
journalctl doesn't show much information about the DHCP communication with the client

vlowther
2018-10-11 18:05
@ben.le Crank the DHCP logging level up to debug in dr-provision, that will include full dumps of inbound and outbound DHCP packets.

shane
2018-10-11 18:20
@ben.le - are you running DRP in a container ?

ben.le
2018-10-11 18:21
the DHCP logging level was set to "debug"

ben.le
2018-10-11 18:21
my DRP is running on VM

shane
2018-10-11 18:29
can you plz drop `tcpdump` or similar sniffer on the DRP endpoint - and do a quick packet capture to make sure you're seeing DHCP packets on the VM NIC? example: `tcpdump -i eth0 -n port 67 and port 68` (set your interface appropriately)

ben.le
2018-10-11 18:36
# tcpdump -i eth0 -n port 67 and port 68
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
11:33:52.931000 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 00:50:56:9a:07:ff, length 548
11:33:54.963827 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 00:50:56:9a:07:ff, length 548
11:33:58.974252 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 00:50:56:9a:07:ff, length 548
11:34:06.994824 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 00:50:56:9a:07:ff, length 548

ben.le
2018-10-11 18:37
the MAC address comes from the testing client VM, but I don't see the bare metal's MAC address
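
For longer captures, a small script can summarize which MAC addresses are asking and whether any replies appear at all. The sketch below assumes the one-line-per-packet `tcpdump -n` text format shown in the capture above; the `Reply` branch is an assumption about how server responses are printed, since none appear in this capture.

```python
import re
from collections import Counter

# Matches lines such as:
#   11:33:52.931000 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 00:50:56:9a:07:ff, length 548
LINE_RE = re.compile(
    r"IP (?P<src>\S+) > (?P<dst>\S+): BOOTP/DHCP, (?P<kind>Request|Reply)"
    r"(?: from (?P<mac>[0-9a-f:]{17}))?"
)


def summarize(capture_text):
    """Return (requests per client MAC, number of replies seen)."""
    requests = Counter()
    replies = 0
    for line in capture_text.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue  # header lines and non-DHCP traffic
        if match.group("kind") == "Request":
            requests[match.group("mac")] += 1
        else:
            replies += 1
    return requests, replies


SAMPLE = """\
11:33:52.931000 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 00:50:56:9a:07:ff, length 548
11:33:54.963827 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 00:50:56:9a:07:ff, length 548
"""

if __name__ == "__main__":
    requests, replies = summarize(SAMPLE)
    for mac, count in requests.items():
        print(f"{mac}: {count} request(s)")
    print(f"replies seen: {replies}")
```

Requests with zero replies point at the server side; a missing MAC points at the network path between that client and the DRP endpoint.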

shane
2018-10-11 18:43
what's the underlying hypervisor ? is this a Mac w/ VirtualBox ... Linux with KVM ... ESXi ?

ben.le
2018-10-11 18:43
VMware ESXi

shane
2018-10-11 18:45
Ok - you're seeing the DHCP client request - but no DHCP server response - are you sure the Client IP address/subnet is a subnet that has been defined on the DRP Endpoint ? We do not respond to DHCP packets if we do not have a Subnet defined that matches the client request.

ben.le
2018-10-11 18:48
all the VMs are on the same subnet (10.29.123.0/24)

ben.le
2018-10-11 18:49
}, "DHCP": { "Alive": true, "Enabled": true, "Port": 67

ben.le
2018-10-11 18:50
# drpcli subnets show local_subnet { "ActiveEnd": "10.29.123.70", "ActiveLeaseTime": 60, "ActiveStart": "10.29.123.50", "Available": true, "Description": "", "Documentation": "", "Enabled": true, "Errors": [], "Meta": {}, "Name": "local_subnet", "NextServer": "", "OnlyReservations": false, "Options": [ { "Code": 3, "Value": "10.29.123.254" }, { "Code": 6, "Value": "10.29.200.30" }, { "Code": 15, "Value": "http://cso.fireeye.com" }, { "Code": 1, "Value": "255.255.255.0" }, { "Code": 28, "Value": "10.29.123.255" } ], "Pickers": [ "hint", "nextFree", "mostExpired" ], "Proxy": false, "ReadOnly": false, "ReservedLeaseTime": 7200, "Strategy": "MAC", "Subnet": "10.29.123.0/24", "Unmanaged": false, "Validated": true }
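
A quick way to sanity-check a subnet definition like this one is to test whether a given address falls inside the subnet and inside the active lease range. This is a rough configuration check only, not DRP's actual request-matching logic, and the endpoint address `10.29.123.20` in the usage lines is a made-up example.

```python
import ipaddress


def subnet_covers(subnet_cidr, active_start, active_end, address):
    """Return (in_subnet, in_active_range) for one address.

    strict=False accepts CIDRs with host bits set, such as the
    "192.168.56.1/24" value seen in another subnet in this channel.
    """
    net = ipaddress.ip_network(subnet_cidr, strict=False)
    addr = ipaddress.ip_address(address)
    in_subnet = addr in net
    in_range = (
        ipaddress.ip_address(active_start)
        <= addr
        <= ipaddress.ip_address(active_end)
    )
    return in_subnet, in_range


if __name__ == "__main__":
    # Values copied from the local_subnet output above.
    subnet = ("10.29.123.0/24", "10.29.123.50", "10.29.123.70")
    print(subnet_covers(*subnet, "10.29.123.20"))  # hypothetical endpoint IP
    print(subnet_covers(*subnet, "10.29.123.60"))  # an address inside the lease range
```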

cat
2018-10-11 20:37
I had the same type of issue, but with Vbox, the solution was to switch the virtual NIC card... I do remember having to do something with VMWare (switching NIC to make DHCP function) but that was at least 3 years ago

cat
2018-10-11 20:38
It was also on VMWare Fusion

ben.le
2018-10-11 21:03
thanks @cat, i will try it and see how that works

cat
2018-10-11 22:19

zehicle
2018-10-12 02:56
video tour of the new MACHINE POOL plugin > https://youtu.be/UPawotXB_Qs

zehicle
2018-10-12 02:56
this is beta functionality... but OMG it's cool

d.schrimpsher
2018-10-12 16:33
So Param question. I was going to do straight machine params but the more I look at it the more I want to make params then attach them to machines. However, the more I read the documentation, it seems like I need to make

d.schrimpsher
2018-10-12 16:33
But I am a little fuzzy on the attach profile part. Do I have to create the machine with the profile already?

d.schrimpsher
2018-10-12 16:34
or can I just attach existing params to the machines directly

d.schrimpsher
2018-10-12 16:36
The document talks about embedded Profile and Param Maps. Is that the same object as /profiles?

d.schrimpsher
2018-10-12 16:37
and machine Profile vs Profiles

d.schrimpsher
2018-10-12 16:37
while you are at it

greg
2018-10-12 16:47
okay - soooo - yes. :slightly_smiling_face:

greg
2018-10-12 16:47
Parameters should be thought about as type definitions.

greg
2018-10-12 16:48
Those "types" can be set on profiles or machines directly.

greg
2018-10-12 16:48
machines can also have profiles that mean they have those parameter values as well.

greg
2018-10-12 16:48
machines can be created with or without profiles and modified later.

greg
2018-10-12 16:49
@d.schrimpsher - does that make sense?

d.schrimpsher
2018-10-12 16:57
So I create parameters. Then I update the machine with those parameters.

d.schrimpsher
2018-10-12 16:57
So if I create a parameter, say {"Name": "Fluffy"}

greg
2018-10-12 16:57
I wasn't clear.

d.schrimpsher
2018-10-12 16:58
I add it to a machine by POST /machine/uuid/params with {"Name": "Fluffy"}

greg
2018-10-12 16:58
parameter `{ "Name": "Fluffy", "Schema": { "type": "string" } }`

d.schrimpsher
2018-10-12 16:58
well yeah

greg
2018-10-12 16:58
I have a parameter named fluffy of type string.

greg
2018-10-12 16:59
I can now set Fluffy with type checking on a machine, profile, or plugin.

d.schrimpsher
2018-10-12 16:59
oh

d.schrimpsher
2018-10-12 16:59
so the value is set when I apply it to the machine

greg
2018-10-12 16:59
yes.

d.schrimpsher
2018-10-12 16:59
oh so type def I get that now

greg
2018-10-12 16:59
We allow untyped parameters on those objects

greg
2018-10-12 17:00
but the safety around data validation is HUGE with parameters. It keeps you from doing bad things.
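
To make the "type definition" idea concrete, here is a minimal sketch of checking a value against the `type` keyword of a param's `Schema`. It is an illustration only, not DRP's validator (which is not shown in this channel), and it ignores every other JSON Schema keyword.

```python
# Python types corresponding to the simple JSON Schema "type" names.
JSON_TYPES = {
    "string": str,
    "boolean": bool,
    "array": list,
    "object": dict,
}


def matches_type(value, schema):
    """Return True if value satisfies schema["type"].

    A schema without "type" is treated as untyped and accepts anything,
    mirroring the "untyped parameters" mentioned above.
    """
    wanted = schema.get("type")
    if wanted is None:
        return True
    if wanted == "null":
        return value is None
    if wanted in ("number", "integer"):
        if isinstance(value, bool):  # bool is a subclass of int in Python
            return False
        if wanted == "integer":
            return isinstance(value, int)
        return isinstance(value, (int, float))
    py_type = JSON_TYPES.get(wanted)
    if py_type is None:
        raise ValueError(f"unsupported schema type: {wanted!r}")
    return isinstance(value, py_type)


if __name__ == "__main__":
    fluffy = {"Name": "Fluffy", "Schema": {"type": "string"}}
    print(matches_type("Orange", fluffy["Schema"]))
    print(matches_type(53, fluffy["Schema"]))
```

With the `Fluffy` definition from above, a string value passes and a number is rejected, which is the kind of mistake the typed param is meant to catch.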

d.schrimpsher
2018-10-12 17:01
so `POST /machine/uuid/params` with `{"Fluffy": "Orange"}` or what? That is kinda where I am getting stuck

greg
2018-10-12 17:01
You can always do this: `drpcli machines set Name:router params jj to greg`

d.schrimpsher
2018-10-12 17:01
no I can't

d.schrimpsher
2018-10-12 17:01
my entire project is around using the api

greg
2018-10-12 17:01
there is an API behind that.

d.schrimpsher
2018-10-12 17:01
I assume but it has a format

d.schrimpsher
2018-10-12 17:02
which I am not clear on

greg
2018-10-12 17:02
`api/v3/machines/<id>/params/<name or parameter>` POST with json blob to become value.

d.schrimpsher
2018-10-12 17:02
oh

shane
2018-10-12 17:03
@d.schrimpsher - have you seen the Swagger API documentation on your DRP endpoint ?

greg
2018-10-12 17:03
Replace my 127.0.0.1 with your DRP endpoint - https://127.0.0.1:8092/swagger-ui/#!/Machines/postMachineParam

d.schrimpsher
2018-10-12 17:03
yes

d.schrimpsher
2018-10-12 17:03
and it says {} for the body

d.schrimpsher
2018-10-12 17:03
which is not super helpful

greg
2018-10-12 17:03
Any json object is stored.

greg
2018-10-12 17:04
So "string" would get stored as a string

greg
2018-10-12 17:04
or `"string"`

greg
2018-10-12 17:04
or `53`

greg
2018-10-12 17:04
a number

d.schrimpsher
2018-10-12 17:04
but accepting any string and understanding how it is put together so your machine can use it are two different things

shane
2018-10-12 17:04
Cool - one trick I often use to figure out the API calls - use the Developer Tools on the web browser via the Portal, and you'll see the backend API calls the Portal makes to the DRP endpoint

greg
2018-10-12 17:04
or `{ "i1": "o1", "i2": "o2"}` would be a deep object

shane
2018-10-12 17:04
that provides real world examples of exactly what's needed

d.schrimpsher
2018-10-12 17:05
thats true. I have been looking at the existing params but the machines dont have any

greg
2018-10-12 17:05
and that is why we have parameters. To define the object schema, and validation. :slightly_smiling_face:

d.schrimpsher
2018-10-12 17:07
So if I did `POST /machine/uuid/params/fluffy` with `Orange` that would work right

d.schrimpsher
2018-10-12 17:07
if it was a string

d.schrimpsher
2018-10-12 17:08
Or does it have to be in a JSON object?

greg
2018-10-12 17:09
it has to be a json object - so include the double quotes.

greg
2018-10-12 17:09
`"Orange"`

d.schrimpsher
2018-10-12 17:09
but no {} or [] needed?

d.schrimpsher
2018-10-12 17:09
right

greg
2018-10-12 17:09
correct

d.schrimpsher
2018-10-12 17:09
ok

d.schrimpsher
2018-10-12 17:10
I think that gets me going

d.schrimpsher
2018-10-12 17:10
thanks

greg
2018-10-12 17:10
cool

d.schrimpsher
2018-10-12 17:14
yeap got one going. Okay now to generalize

vlowther
2018-10-12 17:36
Technically, the `POST /machine/<uuid>/params/<name>` accepts valid JSON, not a JSON object.

vlowther
2018-10-12 17:37
a JSON object specifically refers to `{"this": "kind", "of": "thing"}`

vlowther
2018-10-12 17:39
valid JSON is either a `"quoted string"`, `false`, `true`, `null`, `3.14159`, `["a",["list"], {"of":"things"},null, 1]` or `{"an":"object"}`

vlowther
2018-10-12 17:39
checked against the schema of that param (if there is one), of course.
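
Putting the pieces of this exchange together, a request for setting one param on a machine could be built like this. It is a sketch using only the Python standard library: the path is the one greg gave above, while authentication and TLS settings are deliberately left out because they are not covered in this conversation, and the UUID in the usage lines is a placeholder.

```python
import json
import urllib.request
from urllib.parse import quote


def build_param_request(endpoint, machine_uuid, param_name, value):
    """Build (but do not send) the POST that sets one param on a machine.

    The body is the JSON encoding of the value itself, so a string is
    sent with its double quotes and no surrounding {} or [].
    """
    url = f"{endpoint}/api/v3/machines/{machine_uuid}/params/{quote(param_name, safe='')}"
    body = json.dumps(value).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    req = build_param_request(
        "https://127.0.0.1:8092",  # replace with your DRP endpoint
        "MACHINE-UUID",            # placeholder, not a real machine id
        "Fluffy",
        "Orange",
    )
    print(req.get_method(), req.full_url)
    print(req.data)
```

Because `json.dumps` handles every JSON type, the same function works for numbers, booleans, lists, and objects.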

d.schrimpsher
2018-10-12 18:00
awesome.

d.schrimpsher
2018-10-12 18:00
So I went through the process and the params I get back on the machine is

d.schrimpsher
2018-10-12 18:01
where the param is named "mgmt" does that look right

greg
2018-10-12 18:02
no

d.schrimpsher
2018-10-12 18:03
thats good cause thats weird

greg
2018-10-12 18:03
I'd have expected `{ "mgmt": "10.197.111.12" }`

d.schrimpsher
2018-10-12 18:06
woops my fault, ignore that. I swapped name and value

greg
2018-10-12 18:06
And Victor is right. I meant object as any of the valid json types and not limited to the specific json object. The API call takes json encoded text that can be any JSON type. Thanks for clarifying Victor.

d.schrimpsher
2018-10-12 18:07
works thanks guys

d.schrimpsher
2018-10-12 18:15
Okay I give up. Does the subnet name have some weird requirements I am not aware of.

d.schrimpsher
2018-10-12 18:15
I keep getting violates the unique index Subnet

d.schrimpsher
2018-10-12 18:16
But there isn't a subnet with the name Management_SUBNET in the system

d.schrimpsher
2018-10-12 18:17
or am I misunderstanding the message

d.schrimpsher
2018-10-12 18:17
nevermind

d.schrimpsher
2018-10-12 18:18
i had the same subnet address in there somewhere

b.quan
2018-10-12 20:32
Have you seen this issue? When installing a content pack with a workflow referring to a default stage, it says stage not available (but it actually existed):

b.quan
2018-10-12 20:34
It does not always happen since we also refer to other default stages as well. Another observation: it does not happen if we use python to call drpcli, but happens when we use ansible to call drpcli.

zehicle
2018-10-12 20:40
if your system has internet access, then DRP can get those pieces for you.

zehicle
2018-10-12 20:40
if not, you have to upload the isos

b.quan
2018-10-12 20:49
@zehicle It does have internet access, is it due to the delay for stage to be ready (due to the time taken to get those pieces)?

b.quan
2018-10-12 20:50
If I destroy and upload the same content pack again, it works

shane
2018-10-12 20:55
Is the destroy/reload 100% consistent - or did it just happen once or twice? I ask because the public mirrors are often quite flaky. Would suggest uploading the ISO to be sure those artifacts are available - so you don't have to deal with the vagaries of the public mirrors for marking the content available if that is the issue.

b.quan
2018-10-12 21:50
Yes it's consistent/repeatable with Ansible. So the stage does need to wait for isos to be downloaded to be available?

zehicle
2018-10-12 22:30
Yes

zehicle
2018-10-12 22:31
You can check the bootenv or stage, there's a property on them to tell you they are available

zehicle
2018-10-12 22:31
If you store the iso locally, you can avoid the internet
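
For automation such as the Ansible case described above, one option is to poll that availability property before attaching the workflow. The sketch below is generic: `fetch` is any callable returning the stage or bootenv as a dict, and the `Available` and `Errors` field names are taken from the `drpcli` JSON output pasted earlier in this channel. How `fetch` obtains the object (for example by parsing `drpcli stages show <name>` output) is left as an assumption.

```python
import time


def wait_until_available(fetch, attempts=30, delay=10.0, sleep=time.sleep):
    """Poll fetch() until the returned object reports Available with no Errors.

    Returns True as soon as the object is available, or False after the
    given number of attempts.
    """
    for attempt in range(attempts):
        obj = fetch()
        if obj.get("Available") and not obj.get("Errors"):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


if __name__ == "__main__":
    # Simulated responses: unavailable while the ISO downloads, then ready.
    responses = iter([
        {"Name": "example-stage", "Available": False, "Errors": ["bootenv not available"]},
        {"Name": "example-stage", "Available": True, "Errors": []},
    ])
    print(wait_until_available(lambda: next(responses), attempts=5, delay=0))
```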

b.quan
2018-10-13 01:13
cool, Thanks Rob!

shane
2018-10-13 20:08
... And just like that, it disappears....

zehicle
2018-10-13 20:15
was a work in process... soon.

zehicle
2018-10-13 20:15
be patient, padawan

zehicle
2018-10-13 20:16
that's for me

greg
2018-10-14 00:38
Hi All, I've updated tip to include arch support. If you update to tip, you will need to get tip everything, just to be safe.

greg
2018-10-14 00:39
You should be able to try ARM support. it will likely have issues, but something to look at.

b.quan
2018-10-14 04:51
What's an example of isoUrl (for bootenv) for a local iso file? "file://<iso-file-location>" ?

greg
2018-10-14 11:55
Just put it in the tftpboot/isos directory.

b.quan
2018-10-14 13:09
so we don't need to change the IsoUrl for the ubuntu-16.04-install bootenv setting? The default is: "IsoUrl": "http://mirror.math.princeton.edu/pub/ubuntu-iso/16.04/ubuntu-16.04.5-server-amd64.iso",

greg
2018-10-14 13:16
That is a hint on where to get it if you don't have it in your tftpboot/isos directory. The uploadiso command uses it
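
A small pre-flight check along these lines can tell whether an ISO is already in place before relying on a download. Note the assumption: the sketch guesses the local file name from the last path component of `IsoUrl`, which may not match the file name the bootenv actually expects, so treat it as a convenience check only.

```python
import os
from urllib.parse import urlparse


def local_iso_path(tftpboot_dir, iso_url):
    """Guess where the ISO named by iso_url would live under tftpboot/isos."""
    name = os.path.basename(urlparse(iso_url).path)
    return os.path.join(tftpboot_dir, "isos", name)


def iso_is_local(tftpboot_dir, iso_url):
    """Return True if the guessed ISO file already exists on disk."""
    return os.path.isfile(local_iso_path(tftpboot_dir, iso_url))


if __name__ == "__main__":
    url = "http://mirror.math.princeton.edu/pub/ubuntu-iso/16.04/ubuntu-16.04.5-server-amd64.iso"
    # The tftpboot location below matches the /var/lib/dr-provision layout
    # mentioned elsewhere in this channel; adjust it for your install.
    base = "/var/lib/dr-provision/tftpboot"
    print(local_iso_path(base, url), iso_is_local(base, url))
```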

b.quan
2018-10-14 14:08
cool. Thanks Greg! One more question: does drp support cisco servers, like C240?

shane
2018-10-14 15:11
@b.quan what do you mean by "supported"? We do have customers that successfully use DRP with Cisco servers. The base IPMI plugin supports next-boot and power on functions ...

b.quan
2018-10-14 15:29
@shane exact info that I need. I just wanted to know that if anyone has success using DRP with Cisco servers, as we have members with Cisco servers.

bagricola
2018-10-14 23:11
Hmm... getting this ever since upgrading to tip drp + contents when trying to boot a host. No TFTP requests from the host give any return traffic and that's the only error message given with all logging set to trace. DRP is not installed in a container. This host booted fine off DRP with the old version and no config has changed, so it should work?

zehicle
2018-10-14 23:50
@bagricola which tip version?

zehicle
2018-10-14 23:50
we just dropped a new tip that requires new contents / sledgehammer

zehicle
2018-10-14 23:50
you may want to check your tip content

zehicle
2018-10-14 23:52
and if you have new tip content, then make sure you have latest tip DRP too. there are architecture awareness changes in tip

bagricola
2018-10-14 23:59
yeah, aware of that. Deployed `v3.11.0-tip-39-253af96fff2ba4069249d8e98d53bee7be65207d` from the github release (dr-provision) and also tip for all the content files about 15 mins before I posted the error above, so pretty sure it's the absolute latest code. Shows the multiarch flag, and this was a completely new install (I deleted /var/lib/dr-provision and all settings, iso uploads, content packs etc were autoinstalled via ansible)

bagricola
2018-10-15 00:00
1am though so I need to go to bed, but I'll poke further tomorrow to try and track down what's going on

zehicle
2018-10-15 00:38
it could be something related to the new changes

bagricola
2018-10-15 16:31
hmm, i've put some further debug logging to work out where it's failing and nothing obvious sticks out yet. `lpxelinux.0` and `ldlinux.c32` are served fine, and then looks like pxe makes successive requests for less specific `pxelinux.cfg` files (starting with `pxelinux.cfg/44454c4c-5400-104c-8030-b2c04f595831`). None of those exist which I assume is fine as it's a completely unknown server, so it ends up requesting the default (`pxelinux.cfg/default`) which is what panics over and over.
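
The "successive requests for less specific files" follow PXELINUX's documented lookup order: client UUID, then `01-` plus the MAC address, then the IPv4 address in uppercase hex with one digit dropped per attempt, and finally `default`. The sketch below reproduces that order; the MAC and IP in the usage lines are illustrative values borrowed from another machine in this channel, not from the server being debugged here.

```python
import ipaddress


def pxelinux_config_candidates(client_uuid, mac, ip):
    """List the pxelinux.cfg/ files PXELINUX tries, most specific first."""
    names = []
    if client_uuid:
        names.append(client_uuid.lower())
    # ARP hardware type 01 (Ethernet) followed by the dash-separated MAC.
    names.append("01-" + mac.lower().replace(":", "-"))
    # IPv4 address as 8 uppercase hex digits, shortened one digit at a time.
    hex_ip = "%08X" % int(ipaddress.IPv4Address(ip))
    for length in range(len(hex_ip), 0, -1):
        names.append(hex_ip[:length])
    names.append("default")
    return ["pxelinux.cfg/" + name for name in names]


if __name__ == "__main__":
    for path in pxelinux_config_candidates(
        "44454c4c-5400-104c-8030-b2c04f595831",
        "08:00:27:5e:b7:e8",   # illustrative MAC
        "192.168.56.10",       # illustrative IP
    ):
        print(path)
```

Seeing the full list makes it easier to match TFTP log lines against what the client is expected to request.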

greg
2018-10-15 16:33
okay can you send me the logs and your content versions, drp versions, and plugin versions? @bagricola - I suspect I broke something pulling in the arch stuff.

greg
2018-10-15 16:34
How are you serving DHCP? What bootfile are you using? If any?

bagricola
2018-10-15 16:41
DHCP is DRP internal, everything is direct, no proxying. Just realised I should be able to remove the defer in that tftp function to get a full stack trace for the panic

greg
2018-10-15 16:42
I was just wondering because I'm not sure we would hit `pxelinux.cfg/default` any more without a custom bootfile.

bagricola
2018-10-15 16:44
`BootEnv discovery: /var/lib/dr-provision/tftpboot/sledgehammer/958745f088207052846423e755093d2ac7c7986b already exists` Just realised I'm getting this at startup?

greg
2018-10-15 16:44
That should be okay. I think that will happen all the time for bootenvs that are there.

greg
2018-10-15 16:50
okay - I see something, @bagricola - just a minute

bagricola
2018-10-15 16:51
This would be so much easier to test if this server would attempt PXE booting for more than 30s before it reboots and takes 5 mins to get back to PXE boot stage :face_with_raised_eyebrow:

greg
2018-10-15 16:52
Yeah - I think I found something.

shane
2018-10-15 16:52
no BIOS setting for retry boot method? some hardware does support a continual retry mechanism ...

bagricola
2018-10-15 16:52
it's got this hilarious retry boot option

bagricola
2018-10-15 16:53
"hit any key to retry boot, or wait for reboot"

bagricola
2018-10-15 16:53
so you hit any key, and it reboots

bagricola
2018-10-15 16:53
i mean it's not wrong :face_with_rolling_eyes:

bagricola
2018-10-15 16:54
argh, office being locked up so I'm gonna grab a quick beer on the way home and I'll pick this back up in a bit :slightly_smiling_face: Thanks guys

shane
2018-10-15 16:54
grab one or two for me

greg
2018-10-15 16:55
@bagricola - beer on! Also, can you check the machine and see what Arch it thinks it has.

greg
2018-10-15 16:58
I also repro'ed it on my system.

bagricola
2018-10-15 16:59
The one being booted I assume? Will try work out how when I get back. It's just a 'standard' amd64 Dell server

2018-10-15 16:59
Time to feed the :bear:!

greg
2018-10-15 16:59
it is okay.

greg
2018-10-15 16:59
I think I see what is going on.

greg
2018-10-15 18:38
@bagricola - I've updated tip with a fix that should take care of what you are seeing.

greg
2018-10-15 18:39
You should just need to update drp.

bagricola
2018-10-15 18:43
@greg great, I'll give it a shot in a bit :+1:

greg
2018-10-15 18:51
cool

bagricola
2018-10-15 19:46
@greg it worked, cheers :slightly_smiling_face:

bagricola
2018-10-15 19:48
although the lldp stage failed in discovery, but I'm not sure that's related:

shane
2018-10-15 20:07
@bagricola - maybe DNS resolution issue on your side (`could not resolve host`)? I was able to load that URL and receive a list of mirrors

bagricola
2018-10-15 20:14
yeah something's changed i guess, but can't login to sledgehammer to debug

bagricola
2018-10-15 20:14
thought it was using one of the default passwords but maybe not?

greg
2018-10-15 20:19
sledgehammer password is `rebar1` by default and is strangely hard to change.

shane
2018-10-15 20:19
(root user)

shane
2018-10-15 20:20
Also - suggest adding the `ssh-access` stage and adding your SSH keys for access to Sledgehammer


bagricola
2018-10-15 20:20
yeah good point, i dont have that in the discovery stage :face_palm:

greg
2018-10-15 20:22
`ssh-access` is part of the `discover` stage by default which is usually the first stage in a `discover` workflow.

bagricola
2018-10-15 20:22
This doesn't look right :slightly_smiling_face:

greg
2018-10-15 20:22
well it does for me. :slightly_smiling_face:

bagricola
2018-10-15 20:22
I have a DNS server set in the subnet in DRP

bagricola
2018-10-15 20:22
haha

bagricola
2018-10-15 20:23
i thought it might

greg
2018-10-15 20:23
okay - have to think about that. It should have been overwritten, I think.

greg
2018-10-15 20:23
May need to delete that file as part of sledgehammer construction.

greg
2018-10-15 20:23
What file is it?

bagricola
2018-10-15 20:23
"/etc/resolv.conf"

greg
2018-10-15 20:26
sigh - my brain. Thx

bagricola
2018-10-15 20:27
`sledgehammer-start-up.sh` is logging the right DNS server so the dhcp part is working at least

bagricola
2018-10-15 21:48
weird... ran dhclient by hand and it updated resolv.conf correctly :face_with_raised_eyebrow:

greg
2018-10-16 00:02
Hmmm. It hurt my head because it should fix it

bagricola
2018-10-16 08:18
Yeah, it doesn't make any sense. Going to downgrade to 3.11 just to confirm something changed and it's not just a local thing

bagricola
2018-10-16 08:25
yeah works straight away with 3.11

bagricola
2018-10-16 08:26
I did notice that `lldpd` is already installed in the sledgehammer image there though, so I wonder if I never ran into this before because it never attempted to *install* `lldpd` from sledgehammer and therefore never needed working dns :thinking_face:

greg
2018-10-16 12:56
okay - I'll look into it

greg
2018-10-16 12:56
Thanks for the detective work

robert.graham
2018-10-16 17:47
Are there any documents, tutorials or blogs on how to provision Windows with DRP?

zehicle
2018-10-16 19:34
took this to a 1x1 channel - Windows provisioning uses the RackN Image Deployer plugin

greg
2018-10-17 02:19
@bagricola - I have a new sledgehammer in tip content that might work better with the /etc/resolv.conf stuff. lldpd could be readded to sledgehammer, but should install through epel now.

greg
2018-10-17 02:20
v1.11.1-tip-24

slack1
2018-10-17 05:52
has joined #community201810

zehicle
2018-10-17 11:21
@slack1 $welcome

2018-10-17 11:21
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

bagricola
2018-10-17 14:04
@greg new sledgehammer in tip works great, cheers :slightly_smiling_face:

greg
2018-10-17 14:55
nice!

bagricola
2018-10-17 22:44
i noticed something while changing some custom stages earlier - I had a stage with a required param that didn't exist (I deleted the param and forgot to delete the required entry from the stage). When I ran the workflow containing the stage, it would basically get stuck on the stage with the missing param, but didn't seem to log anything and just seemed stuck. Took me an age to track that down... is there any logging I'm missing there, or does it wait silently for a particular reason?

shane
2018-10-17 23:00
@bagricola - validation for Params happens when an object (the Stage) is created - so the initial validation had passed, but we don't re-validate on run - which is why the "silent error" occurred. We'll need to take a look at this scenario and see how we can do better. I submitted a new issue to track this: https://github.com/digitalrebar/provision/issues/1008
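
Until run-time validation exists, a content pack could be linted before upload with a check like the one below. It is a sketch that assumes each stage is a dict carrying `Name` and a `RequiredParams` list; those field names should be verified against your own stage definitions, and the stage and param names in the usage lines are invented for illustration.

```python
def missing_required_params(stages, defined_params):
    """Map stage name -> sorted list of required params that are not defined."""
    defined = set(defined_params)
    problems = {}
    for stage in stages:
        required = set(stage.get("RequiredParams") or [])
        missing = sorted(required - defined)
        if missing:
            problems[stage["Name"]] = missing
    return problems


if __name__ == "__main__":
    # Hypothetical content: "old-param" was deleted but is still required.
    stages = [
        {"Name": "custom-stage", "RequiredParams": ["old-param", "kept-param"]},
        {"Name": "other-stage", "RequiredParams": []},
    ]
    print(missing_required_params(stages, ["kept-param"]))
```

Running a check like this in CI before the bundle step would surface the stale entry instead of leaving the workflow waiting silently.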

greg
2018-10-17 23:21
@bagricola - parameter or profile?

greg
2018-10-17 23:21
Thanks for writing that up, @shane

bagricola
2018-10-18 14:16
sorry, parameter

bagricola
2018-10-18 14:17
profile didn't change, i deleted the parameter cos didn't need it anymore in one of the tasks, but forgot to remove it from the required list in the stage

bagricola
2018-10-18 14:17
(this is all in a compiled content file btw)

bagricola
2018-10-18 14:18
so it was imported as-one

bagricola
2018-10-18 14:18
not edited by hand

zehicle
2018-10-18 14:18
so the import did not catch the dependency miss? that sounds strange

zehicle
2018-10-18 14:18
was the parameter in the system from a prior import?

bagricola
2018-10-18 14:22
it would've been yeah

bagricola
2018-10-18 14:23
(my dev process is local in git, I commit the changes, tag and push, gitlab CI calls `contents bundle` and then I use ansible to push the bundled contents into DRP using `drpcli contents upload`)

zehicle
2018-10-18 14:27
we've been moving towards using `drpcli contents update` instead of upload. that's invasive and gives better feedback

zehicle
2018-10-18 14:28
knows that color demo and other material still says upload....

bagricola
2018-10-18 14:46
cool, i'll change my approach and see what explodes (if anything) :slightly_smiling_face:

bagricola
2018-10-18 14:49
ahh needs an ID, so initial deploy should still use `upload` yeah? then from that point on `update`?

zehicle
2018-10-18 14:50
it's the CREATE vs UPDATE problem

zehicle
2018-10-18 14:50
so the update will be more friendly after you've done the first upload

greg
2018-10-18 14:57
upload should do the math either way

greg
2018-10-18 14:58
ah - I think I now understand the issue.

greg
2018-10-18 14:58
You are trying to do the "right" thing by keeping required and optional parameters up to date.

greg
2018-10-18 14:58
I'm not sure the delete code will do anything on that.

greg
2018-10-18 14:59
Have to think about it.

greg
2018-10-18 15:01
@bagricola - I think I now understand the problem better.

greg
2018-10-18 15:01
Not sure I'll fix it right away.

bagricola
2018-10-18 15:02
haha okay

bagricola
2018-10-18 15:03
tbh wasn't a big one for me, but it took a ridiculous amount of time to work out what it was because of the way it "fails"

greg
2018-10-18 15:03
Yeah - I'll try it and see what happens.

greg
2018-10-18 15:04
The general guidance is to place the required/optional parameters when possible with the stages and tasks that require them. I know that isn't always the case, but ...

shane
2018-10-18 15:09
- we look forward to all of you joining us next Tuesday October 23rd at 11am PST for v028 meetup. Please see the meetup pages for details: https://www.meetup.com/digitalrebar/events/255615799

b.quan
2018-10-18 15:10
Related to content pack: we define some params in the content pack, but after uploading the content pack, we can see stages, templates, workflows, etc, but I don't see any params defined in the content pack from the UI. Any insights?

greg
2018-10-18 15:14
is there a params directory in the content package directory?

b.quan
2018-10-18 15:15
yes

greg
2018-10-18 15:16
`drpcli params show <name>`

greg
2018-10-18 15:16
does that show it.

b.quan
2018-10-18 15:16
yes

b.quan
2018-10-18 15:16
It's just they are not shown in the portal UI

greg
2018-10-18 15:16
Make sure to log in to the SaaS to get paging. If it is more than 20 down, you may need to scroll to the second page or beyond.

b.quan
2018-10-18 15:17
I only have default login, how do I get my own login?

greg
2018-10-18 15:18
It is the SaaS portal login in the upper right.

zehicle
2018-10-18 15:18
https://portal.rackn.io/#/user should take you there too

b.quan
2018-10-18 15:18
cool, will try that

greg
2018-10-18 15:22
@bagricola - where did you expect the errors to occur?

bagricola
2018-10-18 15:23
I assumed either an error during content import, or an error when the stage attempts to run

greg
2018-10-18 15:23
I created a stage with missing parameters in the required and optional fields.

bagricola
2018-10-18 15:23
i turned logging right up but couldn't find anything

greg
2018-10-18 15:23
There is an error on the machine in my system.

greg
2018-10-18 15:23
The machine goes not runnable and the machine picks up an error explaining the problem.

bagricola
2018-10-18 15:24
hmm... I'm almost certain I didn't see anything like that

greg
2018-10-18 15:24
okay - I'm trying it now as part of a workflow.

greg
2018-10-18 15:25
Yeah - I have that happening on my system.

bagricola
2018-10-18 15:25
basically what I got was a stage that had the yellow icon on the job log (I assume running)

bagricola
2018-10-18 15:25
and the log contents was something like

bagricola
2018-10-18 15:25
``` Log for Job: 8bb7598b-0439-4ef3-9903-85270d480a73 Machine 4fe28d45-0f99-4914-8322-339331ab1d34 changing from stage squiz-post-install to complete ```

greg
2018-10-18 15:25
hmmm

bagricola
2018-10-18 15:25
(except the stage names were different)

bagricola
2018-10-18 15:26
so the machine was showing as running, but not actually doing anything

greg
2018-10-18 15:27
hmm - I'm not seeing it - I may need the workflow, stage, task usage - we can take it to direct message.

greg
2018-10-18 15:28
I get an error on the machine. We don't generate log messages for this though. There is an event because of the error.

bagricola
2018-10-18 15:35
hm. I blatted the drp machine i saw it on to test the deployment process end to end, so I can't find the specific change that caused it

bagricola
2018-10-18 15:36
it was either a missing param in the task or the stage

bagricola
2018-10-18 15:36
i can try to reproduce it but will have to wait until tomorrow :confused:

greg
2018-10-18 15:44
task may break differently than stage. I was only testing stage.

bagricola
2018-10-18 15:53
yeah... i thought it was stage because the task in question had an embedded template, I removed the ref to the param in the template and I'm pretty sure I removed the required param from the task at the same time, and it was the fact it was still in the stage that I missed

nkabir
2018-10-18 18:26
Is there documentation on what is allowed in a content bundle? Ideally, I'd like to bundle everything for a given network including lease reservations and machines. But 'drpcli contents bundle' doesn't seem to incorporate all of the subfolders (e.g. 'reservations'). Moreover, the api for reservations doesn't appear to support arrays so it seems that reservations must be created one at a time via the API.

shane
2018-10-18 18:28
@nkabir you should join us Tuesday at 11am PST for the meetup - we'll be talking about "bundlize/convert" usage which will relate to DHCP leases and machines

shane
2018-10-18 18:28
leases, reservations, and machines are handled differently than "content"

greg
2018-10-18 18:29
In general, you do NOT want machines and leases in a bundle. These are dynamic data and are written as state changes. This is not possible with content bundles.

greg
2018-10-18 18:29
Reservations should be able to be put in content bundles.

greg
2018-10-18 18:31
make sure that the directory is named: `reservations`

greg
2018-10-18 18:32
also, none of the API endpoints allow for bulk creates

greg
2018-10-18 18:32
it is an API design decision. (Potentially an alterable one, but ...)

nkabir
2018-10-18 18:37
What a nice coincidence :slightly_smiling_face: I do not think my schedule lines up but I will check out the recording once it's up. I understand actual leases should not be in a bundle--I will try with reservations again. What is "best practice" for managing the machine collection? My thought was that reservations and machines (not leases) are specified in a bundle for a network. That seems like a nice way to manage environments but I may be wrong. If this will be covered in today's discussion, I can wait. As far as the API goes, I'm coming from a REST-centric perspective where there are two canonical interactions: single and collection. If collections are possible with the API, it would make scripting easier. I have been working primarily with 'contents' so I was getting collection behavior implicitly.

greg
2018-10-18 18:42
Yes - that is the preferred way to deal with collections.

greg
2018-10-18 18:43
We didn't want the API callers to deal with partial success on collections - it is simpler for all involved.
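
Since every create is a separate call, a client that wants to load many reservations loops over them and records each outcome itself. A minimal Python sketch of that loop follows. The `post` callable stands in for a wrapper around `POST /api/v3/reservations`, and the `Addr`/`Token`/`Strategy` field names are assumptions about the Reservation object rather than something confirmed in this thread; `create_each` and `fake_post` are made-up names.

```python
def create_each(post, reservations):
    """POST each reservation separately; return (created, failed) lists.

    post is any callable that sends one object to the API and raises on
    error (assumed to wrap POST /api/v3/reservations; not shown here).
    """
    created, failed = [], []
    for res in reservations:
        try:
            created.append(post(res))
        except Exception as err:  # keep going; report every failure at the end
            failed.append((res["Addr"], str(err)))
    return created, failed


def fake_post(res):
    """Stand-in for the real API call: rejects an empty Token."""
    if res["Token"] == "":
        raise ValueError("Token must not be empty")
    return res


reservations = [
    {"Addr": "192.168.124.10", "Token": "52:54:00:aa:bb:01", "Strategy": "MAC"},
    {"Addr": "192.168.124.11", "Token": "", "Strategy": "MAC"},
]
created, failed = create_each(fake_post, reservations)
print(len(created), "created;", failed)
```

Each object succeeds or fails on its own, so the caller always knows exactly which reservations exist afterwards.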

nkabir
2018-10-18 18:43
That makes sense. Partial success is a pain.

greg
2018-10-18 18:43
Soooo, there is a new command that we cover tomorrow, called convert.

greg
2018-10-18 18:44
This is what you want for bulk insert of machines.

nkabir
2018-10-18 18:44
Ah. Great!

greg
2018-10-18 18:44
This takes a yaml file and injects it as read/write objects in the writable store layer.

greg
2018-10-18 18:44
So, you can discover your network and then use `bundlize` to extract all the machines you want.

greg
2018-10-18 18:45
Then you can use `convert` to inject that yaml as read/write objects.

greg
2018-10-18 18:45
It loses its content-ness, but it works.

nkabir
2018-10-18 18:46
From a slightly different perspective, you envision a folder containing yaml documents corresponding to machines that are posted one by one to declare a network environment? (As an alternative to network discovery)

greg
2018-10-18 18:48
The more likely case I've seen is an excel spreadsheet from an SI documenting macs and other info. A script processes that spreadsheet into machine objects in the environment. But that would work too.

greg
2018-10-18 18:49
The thing is I don't understand your machine to network affiliation. Machines aren't necessarily associated that way.

nkabir
2018-10-18 18:53
My approach may be flawed. I've approached organization of our environments by defining a network (e.g. http://kzau.example.com) and then declare machines that exist in that environment (e.g. http://lsv-0.kzau.example.com, http://lsv-1.kzau.example.com). I typically know which machines will be deployed and haven't needed to perform a network scan. Is there a better way?

greg
2018-10-18 18:54
hmm - not necessarily better. It more depends upon what you are trying to do.

greg
2018-10-18 18:55
If you have a static, constant set of machines that you want to import and then make sure they always get the same set of existing info, then your way is reasonable.

greg
2018-10-18 18:55
As a starting point.

greg
2018-10-18 18:55
I'm not sure I would be reimporting that file after initial use.

greg
2018-10-18 18:56
One of the things I've been thinking about writing is a script to "import" an existing machine into DRP.

greg
2018-10-18 18:56
That way you could run it on the machines in question. It would be like network discovery, but run locally. That would also let you attach additional info for your own use.

greg
2018-10-18 18:56
So, really it would be a shell with gaps for addition, but ...

nkabir
2018-10-18 18:57
Yes--it's a constant set per environment. Essentially, we define a subnet with a recipe of machines to handle known compute, storage, and indexing workloads. I haven't had a need for "discovery" yet since our environments are well-defined. I thought it would be simpler to avoid discovery.

nkabir
2018-10-18 18:58
Based on your videos, you guys deal with much larger fleets!

greg
2018-10-18 19:00
if you are going to install them, you might as well go through discovery, you'll mostly be there anyway. The reservations of MAC to IP will make sure the IP stays the same. Then discovery would just gather the info. The "neat" thing is that discovery by default is not destructive. So, you could set up the reservations, PXE boot a machine, and see that it gets added to DRP. You could then change the workflow to complete or something and reboot the machine back to disk. Voilà, discovered machines but back to what they were doing.

greg
2018-10-18 19:01
If you have the IPMI plugin, it could discover the OOB BMC address as well, but that is more advanced.

nkabir
2018-10-18 19:05
Ah--I see. That makes sense. How are profiles associated with the discovered machines? My original thought was that I would assign those profiles in the machine definitions. Ideally, I'd like to define a network environment and have the machines in ready state without intervention. Then run Ansible against the collection.

greg
2018-10-18 19:06
So - ssh-keys can be defined globally or in a profile.

nkabir
2018-10-18 19:06
I'd prefer a profile and avoid global if possible.

greg
2018-10-18 19:07
There is a package for classification to automatically assign profiles to machines based upon network. See the `classification` package. It may be RackN content, but you should be able to play with it for awhile (then talk to us about size - it may not be an issue).

nkabir
2018-10-18 19:08
That's the missing step! Thanks--I will check that out!

greg
2018-10-18 19:08
Same with `inventory` for building a list of parameters that describe the machine.

nkabir
2018-10-18 19:11

greg
2018-10-18 19:13
np!

dave.parker
2018-10-18 19:39
Hey folks

shane
2018-10-18 19:40
Parker in duh houzzzzzz!

dave.parker
2018-10-18 19:40
I have a question about the python swagger api client. I created it with swagger-codegen and got it working, except I can't seem to figure out how the Patch object works.

dave.parker
2018-10-18 19:44
The docs for patch_machine say "Update a Machine specified by {uuid} using a RFC6902 Patch structure" And the example shows creating an empty Patch object with `body = drp_swagger.Patch()`

dave.parker
2018-10-18 19:44
But I can't figure out how to get my Patch info into that object? The object doesn't seem to have any attributes I can set.

dave.parker
2018-10-18 19:46
Like when I do a dir() on the Machine object I get this: ```['__class__', '__delattr__', '__dict__', '__doc__', '__eq__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'address', 'attribute_map', 'available', 'boot_env', 'current_job', 'current_task', 'description', 'errors', 'hardware_addrs', 'name', 'os', 'params', 'profile', 'profiles', 'read_only', 'runnable', 'secret', 'stage', 'swagger_types', 'tasks', 'to_dict', 'to_str', 'uuid', 'validated', 'workflow']``` Which shows me the stuff I have to set, like name and uuid and params and such.

dave.parker
2018-10-18 19:47
But Patch looks like this: ```['__class__', '__delattr__', '__dict__', '__doc__', '__eq__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', 'attribute_map', 'swagger_types', 'to_dict', 'to_str']```

dave.parker
2018-10-18 19:47
I can't find any examples on how to use this.

greg
2018-10-18 19:47
yes

greg
2018-10-18 19:48
So, you need a library that takes two objects and generates a RFC6902 patch structure.

dave.parker
2018-10-18 19:48
Ok

greg
2018-10-18 19:48
The starting object and the new object. It generates an array of patch actions.

greg
2018-10-18 19:48
give me a second.

dave.parker
2018-10-18 19:48
No problem.


greg
2018-10-18 19:50
While it is Go - it can give the idea.

dave.parker
2018-10-18 19:50
Ok

greg
2018-10-18 19:50
The first structure you come to is the patch operation, which is sent as an array of operations in the PATCH method.

greg
2018-10-18 19:50
The `test` action is useful for generating atomic operations.


greg
2018-10-18 19:52
This line: `patch = jsonpatch.JsonPatch.from_diff(src, dst)`

greg
2018-10-18 19:52
Then send `patch` as a json object on the PATCH call.

dave.parker
2018-10-18 19:52
Ah, ok.

greg
2018-10-18 19:52
now, that library may not generate tests.

greg
2018-10-18 19:53
If you need atomic operations, you need to alter the patch to start with test operations.

greg
2018-10-18 19:53
That is what the drpcli add and remove parameter calls do.
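
A rough Python sketch of the idea, using only the standard library so it runs as-is: diff the original and modified objects into RFC 6902 operations, and put a `test` operation in front for every value that is about to be replaced or removed. The names `escape_pointer` and `guarded_patch` are made up for illustration, the diff only looks at top-level keys, and the `Workflow`/`Stage`/`Description` fields are just sample data. For nested objects, `jsonpatch.JsonPatch.from_diff` (named above) is the more likely starting point, with the `test` operations prepended by hand.

```python
import json


def escape_pointer(key):
    """Escape a key for a JSON Pointer (RFC 6901): '~' -> '~0', '/' -> '~1'."""
    return key.replace("~", "~0").replace("/", "~1")


def guarded_patch(src, dst):
    """Return RFC 6902 ops turning src into dst, preceded by 'test' guards.

    Only top-level keys are compared. Every replaced or removed value gets
    a 'test' op carrying the value the caller originally saw.
    """
    tests, changes = [], []
    for key in sorted(src.keys() | dst.keys()):
        path = "/" + escape_pointer(key)
        if key not in dst:
            tests.append({"op": "test", "path": path, "value": src[key]})
            changes.append({"op": "remove", "path": path})
        elif key not in src:
            changes.append({"op": "add", "path": path, "value": dst[key]})
        elif src[key] != dst[key]:
            tests.append({"op": "test", "path": path, "value": src[key]})
            changes.append({"op": "replace", "path": path, "value": dst[key]})
    return tests + changes


before = {"Workflow": "discover", "Description": "rack 2"}
after = {"Workflow": "install", "Description": "rack 2", "Stage": "none"}
patch = guarded_patch(before, after)
print(json.dumps(patch, indent=2))
```

The resulting list is what goes in the body of the PATCH call as JSON.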

greg
2018-10-18 19:54
Also the drpcli uses patch as well. It also takes a flag `--ref` which allows you to send in the ref object.

greg
2018-10-18 19:54
That is used to generate the patch.

greg
2018-10-18 19:55
It allows you to do a get, manipulate the object, and do an update call with the original ref object. It will generate PATCH and send that to server. That way you can be sure your object update is appropriate for the original object.

greg
2018-10-18 19:55
Anyway, let me know if that is clear as mud. :slightly_smiling_face:

dave.parker
2018-10-18 19:56
I think I grokked about half of it... :slightly_smiling_face: But let me go over it all and I think I can puzzle it out. If not I'll be back. Thanks!

greg
2018-10-18 19:58
To be fair, this is one of the most advanced features of our API. It allows you to use the DRP objects as atomic data. With this basic feature, you can build safe allocation systems, leader election, and other things.

dave.parker
2018-10-18 19:59
Hehehe

greg
2018-10-18 20:01
soooo - what do you want to use it for?

greg
2018-10-18 20:01
:slightly_smiling_face:

dave.parker
2018-10-18 20:06
Really all I want to do is be able to set workflows and params and profiles and such on machines in python. I was actually using the python requests library to do it, but was trying to use the generated swagger library instead since it seemed like it'd be easier...

dave.parker
2018-10-18 20:06
Little did I know. :smile:

greg
2018-10-18 20:14
Oh for that you could do a Put of the object.

greg
2018-10-18 20:15
Well. Swagger is finicky

greg
2018-10-18 20:15
On way home. Love you

greg
2018-10-18 20:16
Wife isn't in this channel. Sigh.

greg
2018-10-18 20:16
I love the community and all that. But

zehicle
2018-10-18 21:00
@dave.parker we're going to talk about the pool plugin for the community meeting - it allows you to checkout/allocate a machine and set those items and then return them when you are done. I posted a video about it earlier in the week

zehicle
2018-10-18 21:01
it's a RackN extension

s.pisarski
2018-10-18 21:30
@zehicle - I have scripted all of the DRP objects we believe are necessary for PXE booting but we have done something wrong as the image isn't getting downloaded from the TFTP server. We are getting DHCP requests and leases... Any pointers would be most appreciated

zehicle
2018-10-18 21:33
1) did it work before you scripted it? 2) ramp up the logging levels gradually to see if there's a template or model error 3) look at the console of the machines to see how far they are getting in the network bootload sequence 4) is this physical or virtual systems? need to expand to @shane @vlowther and @greg to get help

zehicle
2018-10-18 21:33
@s.pisarski also, I'm assuming DRP is the TFTP server... is that right?

s.pisarski
2018-10-18 21:39
@zehicle - they are virtual servers right now and DRP should be the TFTP server

s.pisarski
2018-10-18 21:40
the machines are only getting their IP address but do not appear to attempt to download

s.pisarski
2018-10-18 21:41
and I haven't tried it manually...

zehicle
2018-10-18 21:41
did you set the unknown workflow pref?

zehicle
2018-10-18 21:41
if the DRP DHCP is working but nothing is assigned, that's the typical reason

zehicle
2018-10-18 21:42
if that works, I'll make sure to update the FAQs

shane
2018-10-18 21:42
What does the Subnet configuration look like ?

s.pisarski
2018-10-18 21:43
I will assume that the "unknown workflow" pref has not been set. I'll do some more reading

s.pisarski
2018-10-18 21:44
@shane - Here's a screenshot

shane
2018-10-18 21:45
You do need `defaultWorkflow` set - is the Virtual environment VirtualBox ?

zehicle
2018-10-18 21:45
@s.pisarski if you scripted the objects but did not include `drpcli prefs set ...` then that is likely the cause

s.pisarski
2018-10-18 21:47
@shane - we're testing on OpenStack. Basically mocking baremetal

shane
2018-10-18 21:47
Cool - VirtualBox is a "special needs" environment ... and subnets need more helpers for vbox environment (which is why I asked)

s.pisarski
2018-10-18 21:47
nice

shane
2018-10-18 21:48
as @zehicle says - check you have all of your Prefs set (see the $quickstart for full details)

2018-10-18 21:48

s.pisarski
2018-10-18 21:48
I've turned off network security too

s.pisarski
2018-10-18 21:49
thanks @zehicle & @shane. I'll ping ya if I have more educated questions :slightly_smiling_face:

s.pisarski
2018-10-18 21:57
@zehicle - we did indeed run the following command as you mentioned above "drpcli prefs set defaultWorkflow discovery unknownBootEnv discovery defaultBootEnv sledgehammer defaultStage discover"

s.pisarski
2018-10-18 22:00
@zehicle - btw on an unrelated note. When creating machines immediately after creating the content pack, I have been getting exceptions thrown as it appears that there are some background processes still working on the content pack. I had to add 2 hooks to ensure the machines have been created. 1. sleep 10 seconds before requesting a machine

s.pisarski
2018-10-18 22:00
2. if that fails, I cleanup the content-pack and recreate and try again

s.pisarski
2018-10-18 22:01
it generally has been working after the first retry...

shane
2018-10-18 22:02
Are you doing a `drpcli bootenvs uploadiso ...` command in your scripting? That often takes some time to complete

shane
2018-10-18 22:02
until that completes the BootEnvs will not be marked Validated and Available for use

s.pisarski
2018-10-18 22:02
@shane - yes

s.pisarski
2018-10-18 22:02
and am blocking until the upload has completed

s.pisarski
2018-10-18 22:03
that is the first operation that is taking place after DRP has been installed

shane
2018-10-18 22:03
I don't recall if we do an "upload and return, then explode ISO" or an "upload, explode iso, then return"

shane
2018-10-18 22:03
if the ISO hasn't also been exploded out on disk - it'll still be unavailable

s.pisarski
2018-10-18 22:04
ah, that explains why this smells like a race condition

s.pisarski
2018-10-18 22:05
is there any way I can poll for this info?

s.pisarski
2018-10-18 22:05
I hate adding sleeps and retry on exception blocks

s.pisarski
2018-10-18 22:05
hooks

shane
2018-10-18 22:05
yes - you can check that the BootEnv returns Available and Validated both as True
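
A minimal polling sketch in Python for that check. It assumes the BootEnv object exposes boolean `Available` and `Validated` fields as described here; `fetch` is a placeholder for whatever retrieves the object (for example a wrapper around `GET /api/v3/bootenvs/<name>`, which is assumed and not shown), and `wait_for_bootenv` is a made-up name. The condition is checked immediately on the first pass, so an already-ready BootEnv returns without waiting.

```python
import time


def wait_for_bootenv(fetch, name, timeout=60.0, interval=2.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll fetch(name) until the BootEnv is Available and Validated.

    fetch is any callable returning the BootEnv as a dict. Raises
    TimeoutError if the BootEnv is still not ready after `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        env = fetch(name)
        if env.get("Available") is True and env.get("Validated") is True:
            return env
        if clock() >= deadline:
            raise TimeoutError(f"bootenv {name} not ready after {timeout}s")
        sleep(interval)


# Stand-in for the real API call: the BootEnv becomes ready on the third poll.
responses = iter([
    {"Name": "sledgehammer", "Available": False, "Validated": False},
    {"Name": "sledgehammer", "Available": False, "Validated": True},
    {"Name": "sledgehammer", "Available": True, "Validated": True},
])
ready = wait_for_bootenv(lambda name: next(responses), "sledgehammer", interval=0)
print(ready["Name"], "is ready")
```

This still sleeps between polls, but the wait ends as soon as the BootEnv is usable, and a timeout is reported explicitly instead of surfacing as an unrelated exception later.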

s.pisarski
2018-10-18 22:06
cool, another backlog item :slightly_smiling_face:

shane
2018-10-18 22:06
there are also some hooks to wait in some of the command structures, until something is done

shane
2018-10-18 22:07
check `drpcli bootenvs wait ....` to have that block and wait until Available is true

s.pisarski
2018-10-18 22:12
@shane - I keep getting timeout returned from the following which has been up for awhile so it should return right away: drpcli bootenvs wait ubuntu-16.04-install Available true 10

s.pisarski
2018-10-18 22:12
figure I'm calling it incorrectly

shane
2018-10-18 22:12
I haven't used that construct (TBH) - so not sure other than it exists

shane
2018-10-18 22:13
it might be that we only check the condition if it changes, and it's not checked on initial start of the command?

shane
2018-10-18 22:13
we'd have to have @greg weigh in on how that `wait` construct works to be sure

s.pisarski
2018-10-18 22:13
so the syntax is correct as you see it?

shane
2018-10-18 22:14
well - according to the Help, yes

s.pisarski
2018-10-18 22:15
cool, that was what I was going off of but wasn't sure the field & values I placed in there were right :slightly_smiling_face:

greg
2018-10-18 22:15
yeah - it seems correct, but I haven't used it.

s.pisarski
2018-10-18 22:15
I guess I'll just keep my hook in place until we complete our python API lib

shane
2018-10-18 22:16
my `sledgehammer` is `"Available": true`, so I'd expect this to work: ```drpcli bootenvs wait sledgehammer Available true 3``` but - it chucks a timeout as well

greg
2018-10-18 22:16
bug? I guess.

greg
2018-10-18 22:16
I'll have to check it.

s.pisarski
2018-10-18 22:17
woot

chedbob.pm
2018-10-19 03:31
has joined #community201810

zehicle
2018-10-20 00:13
@chedbob.pm $welcome

2018-10-20 00:13
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

chedbob.pm
2018-10-20 00:26
Thanks Rob

slack1
2018-10-20 11:34
I'm having trouble getting DRP running on a Raspbian although the exact same steps produce a working application on a Debian VM using the same Ansible playbook. To rule out playbook problems I've done an isolated install and I get the following output every time I start the application
```
dr-provision2018/10/20 11:27:46.178552 Version: v3.11.0-0-8da90ceaf50f97c0d645c081f0384507e446424e
dr-provision2018/10/20 11:27:46.180919 Extracting Default Assets
dr-provision2018/10/20 11:27:46.181356 Extracting Default Assets
dr-provision2018/10/20 11:27:52.082764 Starting metrics server
dr-provision2018/10/20 11:27:53.486788 Starting TFTP server
dr-provision2018/10/20 11:27:53.487708 Starting static file server
dr-provision2018/10/20 11:27:53.488485 Starting DHCP server
dr-provision2018/10/20 11:27:53.491895 Starting PXE/BINL server
dr-provision2018/10/20 11:27:53.493434 Starting API server
http: panic serving 127.0.0.1:52368: runtime error: invalid memory address or nil pointer dereference
goroutine 235 [running]:
net/http.(*conn).serve.func1(0x12c0c360)
	/home/travis/.gimme/versions/go1.10.linux.amd64/src/net/http/server.go:1726 +0x9c
panic(0x869508, 0x1903c90)
	/home/travis/.gimme/versions/go1.10.linux.amd64/src/runtime/panic.go:505 +0x204
sync/atomic.addUint64(0x12b76104, 0x1, 0x0, 0x1, 0x12edf940)
	/home/travis/.gimme/versions/go1.10.linux.amd64/src/sync/atomic/64bit_arm.go:31 +0x4c
github.com/digitalrebar/provision/vendor/github.com/digitalrebar/logger.(*Buffer).NewGroup(0x12b76100, 0x12edf940, 0xbeeae22e)
	/home/travis/gopath/src/github.com/digitalrebar/provision/vendor/github.com/digitalrebar/logger/logger.go:252 +0x34
github.com/digitalrebar/provision/vendor/github.com/digitalrebar/logger.(*log).Fork(0x12e4f140, 0x0, 0x12af4)
	/home/travis/gopath/src/github.com/digitalrebar/provision/vendor/github.com/digitalrebar/logger/logger.go:596 +0x104
github.com/digitalrebar/provision/server.server.func1(0xa11ab0, 0x12da0240, 0x1)
	/home/travis/gopath/src/github.com/digitalrebar/provision/server/server.go:484 +0x38
net/http.(*conn).setState(0x12c0c360, 0xa11ab0, 0x12da0240, 0x1)
	/home/travis/.gimme/versions/go1.10.linux.amd64/src/net/http/server.go:1673 +0x94
net/http.(*conn).serve(0x12c0c360, 0xa0f3b0, 0x131f09e0)
	/home/travis/.gimme/versions/go1.10.linux.amd64/src/net/http/server.go:1771 +0xbc4
created by net/http.(*Server).Serve
	/home/travis/.gimme/versions/go1.10.linux.amd64/src/net/http/server.go:2795 +0x208
```
I have received the same message on Raspbian 2018-10-09 and 2018-06-27. The last section repeats for API (1) through API (4)

shane
2018-10-20 15:21
@slack1 - I added in arch support to the install script a while ago - but it probably doesn't handle all `arch` cases correctly. On the device you're trying to install DRP on - can you please provide the output of the following: ``` uname -m uname -s```

shane
2018-10-20 15:22
It looks like you're getting an ARM build - but it may not be a correct build for your platform. Unfortunately - the ARM hardware ecosystem leaves a lot to be desired in this respect ... :disappointed:

shane
2018-10-20 15:22
It looks like the ARM build binaries we try to use aren't correct for your platform

shane
2018-10-20 15:23
(I forgot I had added in the ARM install hooks a while back ... :slightly_smiling_face: )

shane
2018-10-20 15:24
(also chuck a `uname -a` in there as well - plz)

slack1
2018-10-20 15:26
Thanks @shane, is issue #1003 the best place to follow progress on this? I did originally suspect it was an architecture issue but when I saw that ARMv7 bins had been installed I assumed this was not the case here. Here's the information you requested: ``` uname -m armv7l uname -s Linux uname -a Linux raspberrypi 4.14.71-v7+ #1145 SMP Fri Sep 21 15:38:35 BST 2018 armv7l GNU/Linux ```

shane
2018-10-20 15:27
so we do build for `armv7l` - and the install script supports that - but we don't test regularly with that architecture - we don't have any of that hardware

shane
2018-10-20 15:29
#1003 just references helper binaries for content use - not for the install or running of the `dr-provision` binary itself

shane
2018-10-20 15:31
we use Go's crosscompile support to target the `armv7l` (Go refers to it as `arm_v7`) support - but we pretty much blindly rely on it "to do the right things"

shane
2018-10-20 15:31
others have tested the ARM builds on various hardware - just not sure about your case

slack1
2018-10-20 15:51
Okay I'll give it a try on some older Pis I have lying around and failing that I'll just build an Intel VM for DRP. Is this something I can help with or do we need to wait for the Go maintainers to release an update that fixes this cross compiling issue? Unfortunately I'm not a developer and I've never worked with Go so my basic coding skills are unlikely to help here :slightly_smiling_face:

zehicle
2018-10-20 16:28
@slack1 we don't have that hardware around for testing, so we're in the same boat

don
2018-10-22 04:30
has joined #community201810

b.quan
2018-10-22 04:47
We ran into the following error after connecting to the drp dhcp service, any idea what could be causing this issue?

zehicle
2018-10-22 04:48
Make sure that you have the right sledgehammer exploded

zehicle
2018-10-22 04:49
it's possible that the bootenv for your discovery image points to a different set of files

zehicle
2018-10-22 04:49
check the bootenvs to make sure they are validated / available

zehicle
2018-10-22 04:52
@don $welcome

2018-10-22 04:52
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

don
2018-10-22 04:57
Thanks rob

akl
2018-10-22 14:31
has joined #community201810

shane
2018-10-22 16:23
@akl $welcome

2018-10-22 16:23
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

dave.parker
2018-10-22 17:39
Still stuck with this python swagger API. I can make a patch of type JsonPatch and pass it to the patch_machine_params method (which claims it takes a machine uuid and a Json Patch object) but I get an attribute error.

dave.parker
2018-10-22 17:39
`AttributeError: 'JsonPatch' object has no attribute 'swagger_types'`

dave.parker
2018-10-22 17:40
It seems like it doesn't actually want a JsonPatch object, but some other type of object. Which I guess is why the docs say to do this and then pass that as the patch object `body = drp_swagger.Patch()`

dave.parker
2018-10-22 17:41
But I can't figure out how to either merge my patch and that object type, or get my patch into that object. Or... however it's supposed to work.

greg
2018-10-22 17:55
@dave.parker - you should try and get it working with curl - so you can understand what is really going on. My guess is that your swagger generator is doing esoteric things.

greg
2018-10-22 17:55

greg
2018-10-22 17:55
That gets the parameters from the machine named greg

greg
2018-10-22 17:55
I'll continue building them here - it will take a minute.

greg
2018-10-22 17:56
This gets the parameter greg from the machine named greg `curl -k -u rocketskates:r0cketsk8ts https://127.0.0.1:8092/api/v3/machines/Name:greg/params/greg `

greg
2018-10-22 17:56
This will set without patch the parameter to a value `curl -k -u rocketskates:r0cketsk8ts https://127.0.0.1:8092/api/v3/machines/Name:greg/params/greg -H "Content-Type: application/json" -X POST -d'"fred"'`

greg
2018-10-22 17:57
This will remove a parameter without PATCH `curl -k -u rocketskates:r0cketsk8ts https://127.0.0.1:8092/api/v3/machines/Name:greg/params/greg -H "Content-Type: application/json" -X DELETE`
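
For anyone rolling their own client, the three parameter calls above translate fairly directly to Python. This standard-library sketch only builds the requests and does not send them; the endpoint, credentials, and machine/parameter names are the ones from the curl examples, and `param_request` is a made-up helper name.

```python
import base64
import json
import urllib.request

BASE = "https://127.0.0.1:8092/api/v3"  # endpoint taken from the curl examples


def param_request(machine, param, method="GET", value=None,
                  user="rocketskates", password="r0cketsk8ts"):
    """Build (but do not send) a request for one parameter on one machine."""
    url = f"{BASE}/machines/Name:{machine}/params/{param}"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {
        "Authorization": f"Basic {token}",
        "Content-Type": "application/json",
    }
    # The body is the bare JSON value, e.g. "fred" (including the quotes).
    data = None if value is None else json.dumps(value).encode()
    return urllib.request.Request(url, data=data, headers=headers, method=method)


get_req = param_request("greg", "greg")                               # read
set_req = param_request("greg", "greg", method="POST", value="fred")  # set
del_req = param_request("greg", "greg", method="DELETE")              # remove

for req in (get_req, set_req, del_req):
    print(req.get_method(), req.full_url, req.data)
```

To send one, pass it to `urllib.request.urlopen` together with an `ssl` context that skips certificate verification (the equivalent of `curl -k`), which is only reasonable against a local test instance.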

dave.parker
2018-10-22 18:06
I can do it with curl fine. I can also do it with the Python requests library. I just thought it would be easier to use the swagger client rather than essentially create my own API client in Python with the requests library. I'm suspecting that was very wrong though. :smile:

greg
2018-10-22 18:07
I have yet to find a swagger code generator that generates usable code.

greg
2018-10-22 18:08
For completeness, here is how to use patch on parameters of a machine.

greg
2018-10-22 18:08
`curl -k -u rocketskates:r0cketsk8ts https://127.0.0.1:8092/api/v3/machines/Name:greg/params -H "Content-Type: application/json" -X PATCH -d '[{"op": "test", "path": "/fred", "value": "GREG"},{"op": "replace", "path": "/fred", "value": "BED" }]'`

greg
2018-10-22 18:09
That will atomically replace the value in parameter `fred` with the value `BED` only if the value is currently `GREG`

dave.parker
2018-10-22 18:32
Cool.

dave.parker
2018-10-22 18:33
I'm going to abandon this route and just go back to requests, which was working. :smile:

dave.parker
2018-10-22 18:33
Thanks

zehicle
2018-10-22 19:00
@dave.parker @r.levensalor << both interested in Python API library

dave.parker
2018-10-22 19:01
I'm currently rolling my own since I'm unable to get the swagger codegen one to work.

r.levensalor
2018-10-22 19:02
@dave.parker we started a Python API library. https://github.com/cablelabs/drp-python It's only a partial implementation, but would be happy to collaborate on it.

dave.parker
2018-10-22 19:02
Oh cool.

dave.parker
2018-10-22 19:02
Let me look.

r.levensalor
2018-10-22 20:27
This is a follow-up from the issues that @s.pisarski was having. We have two almost identical systems. One we created manually and the other we scripted. On the one that we created manually, the SNAME and FNAME are set in the DHCPOFFER. With the instance that we scripted, the SNAME and FNAME are just set to . in the DHCPOFFER. Where is the SNAME and FNAME created by dr-provision and any thoughts on what is happening differently? Unfortunately, Steve had to head out for a while, so I may not be able to answer all of the questions about what was done on each system.

greg
2018-10-22 20:29
@r.levensalor - check to make sure the prefs are set correctly and the bootenvs are available.

r.levensalor
2018-10-22 20:32
The prefs look the same, but we are specifying the bootenv and not using the default.

r.levensalor
2018-10-22 20:33
The bootenvs and workflows look the same. On both servers, /var/lib/dr-provision/tftpboot is 2.5 GB and looks the same.

greg
2018-10-22 20:34
So, for the unknown bootenv what did you put there?

r.levensalor
2018-10-22 20:35
cat unknownBootEnv.json {"Meta":{},"Name":"unknownBootEnv","Val":"discovery"}

r.levensalor
2018-10-22 20:36
For the machines that we are booting, it is set to "BootEnv":"ubuntu-16.04-install"

greg
2018-10-22 20:36
discovery is the normal one and it reports available.

greg
2018-10-22 20:36
hmm - ?? what do you mean for that last sentence?

r.levensalor
2018-10-22 20:37
This is the machine/id.json for the working system. ```cat da11ba2b-509e-47c0-b77a-5386a2a35f26.json {"Validated":false,"Available":false,"Errors":[],"ReadOnly":false,"Meta":{"color":"black","feature-flags":"change-stage-v2","icon":"server"},"Name":"test","Description":"test","Uuid":"da11ba2b-509e-47c0-b77a-5386a2a35f26","CurrentJob":"","Address":"10.1.0.11","Stage":"ubuntu-16.04-install","BootEnv":"ubuntu-16.04-install","Profiles":[],"Profile":{"Validated":false,"Available":false,"Errors":null,"ReadOnly":false,"Meta":null,"Name":"","Description":"","Documentation":"","Params":null},"Params":{},"Tasks":["stage:ubuntu-16.04-install","bootenv:ubuntu-16.04-install","ubuntu-drp-only-repos","ssh-access","stage:complete","bootenv:local"],"CurrentTask":-1,"Runnable":true,"Secret":"_JNdSWE3TcyjvV3L","OS":"ubuntu-16.04","HardwareAddrs":[],"Workflow":"clone-ubuntu16"}``` This is the one for the non-working system. ```{"Validated":false,"Available":false,"Errors":[],"ReadOnly":false,"Meta":{"color":"black","feature-flags":"change-stage-v2","icon":"server"},"Name":"test","Description":"test","Uuid":"6925e6b3-eb4f-4b90-82ca-6e64bf3c2bff","CurrentJob":"","Address":"10.1.0.11","Stage":"ubuntu-16.04-install","BootEnv":"ubuntu-16.04-install","Profiles":[],"Profile":{"Validated":false,"Available":false,"Errors":null,"ReadOnly":false,"Meta":null,"Name":"","Description":"","Documentation":"","Params":null},"Params":{"rs-debug-enable":true},"Tasks":["stage:ubuntu-16.04-install","bootenv:ubuntu-16.04-install","ubuntu-drp-only-repos","ssh-access","stage:complete","bootenv:local"],"CurrentTask":-1,"Runnable":true,"Secret":"ANzUhIrfREq--UUG","OS":"ubuntu-16.04","HardwareAddrs":[],"Workflow":"clone-ubuntu16"}```

r.levensalor
2018-10-22 20:37
We don't want to use discovery

greg
2018-10-22 20:39
okay - so is the ubuntu-16.04-install bootenv available?

greg
2018-10-22 20:39
Does the machine have internet access?

r.levensalor
2018-10-22 20:39
As far as I can tell.

r.levensalor
2018-10-22 20:40
I can reach the internet from that machine, but it is not accessible from the internet.

r.levensalor
2018-10-22 20:42
`drpcli bootenvs exists ubuntu-16.04-install` returns 0 on both systems.

greg
2018-10-22 20:43
`drpcli bootenvs show ubuntu-16.04-install | jq .Available`

r.levensalor
2018-10-22 20:43
returns `true`

greg
2018-10-22 20:44
check your subnet or reservations for bootfile overrides?

greg
2018-10-22 20:44
Don't know.

greg
2018-10-22 20:44
Would need to see your automation

r.levensalor
2018-10-22 20:46
This is the subnet that works. ```{"Validated":false,"Available":false,"Errors":[],"ReadOnly":false,"Meta":{},"Name":"local_subnet","Description":"","Documentation":"","Enabled":true,"Proxy":false,"Unmanaged":false,"Subnet":"10.1.0.0/24","NextServer":"","ActiveStart":"10.1.0.2","ActiveEnd":"10.1.0.254","ActiveLeaseTime":60,"ReservedLeaseTime":7200,"OnlyReservations":false,"Options":[{"Code":3,"Value":"10.1.0.1"},{"Code":6,"Value":"8.8.8.8"},{"Code":15,"Value":"http://example.com"},{"Code":1,"Value":"255.255.255.0"},{"Code":28,"Value":"10.1.0.255"}],"Strategy":"MAC","Pickers":["hint","nextFree","mostExpired"]}``` This is the subnet that doesn't work. ```{"Validated":false,"Available":false,"Errors":[],"ReadOnly":false,"Meta":{},"Name":"Managment_SUBNET","Description":"management","Documentation":"","Enabled":true,"Proxy":false,"Unmanaged":true,"Subnet":"10.1.0.0/24","NextServer":"","ActiveStart":"10.1.0.2","ActiveEnd":"10.1.0.254","ActiveLeaseTime":60,"ReservedLeaseTime":7200,"OnlyReservations":false,"Options":[{"Code":1,"Value":"255.255.255.0"},{"Code":3,"Value":"10.1.0.1"},{"Code":6,"Value":"8.8.8.8"},{"Code":15,"Value":"http://example.com"},{"Code":28,"Value":"10.1.0.255"}],"Strategy":"MAC","Pickers":["hint"]}``` We have reservations for the IP on both systems.

greg
2018-10-22 20:47
In the second subnet, why do you have unmanaged set to true and the pickers set to only hint?

greg
2018-10-22 20:48
```
// Unmanaged indicates that dr-provision will never send
// boot-related options to machines that get leases from this
// subnet. If false, dr-provision will send whatever boot-related
// options it would normally send. It is an error for Unmanaged and
// Proxy to both be true.
//
// required: true
Unmanaged bool
```

greg
2018-10-22 20:49
You need to set unmanaged to false if you want bootfile sent.

greg
2018-10-22 20:50
Pickers should be all three values as well.

greg
2018-10-22 20:50
The reservation will save you there.
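
The two points raised here (Unmanaged must be false, Pickers should list all three values) are easy to check before a scripted subnet is created. This is a minimal Python sketch under that assumption; `boot_option_warnings` is a hypothetical helper, and the two dictionaries are trimmed copies of the subnets pasted above.

```python
# The picker list taken from the working subnet in this thread.
FULL_PICKERS = ["hint", "nextFree", "mostExpired"]


def boot_option_warnings(subnet):
    """Return a list of warnings for a subnet dict (hypothetical checker)."""
    warnings = []
    unmanaged = subnet.get("Unmanaged", False)
    if unmanaged:
        warnings.append("Unmanaged is true: boot-related options will not be sent")
    if unmanaged and subnet.get("Proxy", False):
        warnings.append("Unmanaged and Proxy are both true, which is an error")
    missing = [p for p in FULL_PICKERS if p not in subnet.get("Pickers", [])]
    if missing:
        warnings.append("Pickers is missing: " + ", ".join(missing))
    return warnings


# Trimmed copies of the two subnets from the conversation.
working_subnet = {
    "Name": "local_subnet",
    "Proxy": False,
    "Unmanaged": False,
    "Pickers": ["hint", "nextFree", "mostExpired"],
}
scripted_subnet = {
    "Name": "Managment_SUBNET",
    "Proxy": False,
    "Unmanaged": True,
    "Pickers": ["hint"],
}

for subnet in (working_subnet, scripted_subnet):
    print(subnet["Name"], boot_option_warnings(subnet))
```

Running a check like this in the automation that creates the subnet would flag the difference between the manual and scripted systems before any machine tries to boot.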

r.levensalor
2018-10-22 20:53
@greg Excellent. Are there different defaults from the portal and the api? We don't currently specify pickers or unmanaged with either case.

greg
2018-10-22 20:54
Yes - possibly.

greg
2018-10-22 20:55
The UX attempts to do some additional math on setting the values. Unmanaged should have been set to false by default if left blank. Pickers is probably set to the full list in the UX by default.

r.levensalor
2018-10-22 21:20
One more question: Why doesn't specifying --ipaddr add `--static-ip=10.197.133.151 --force-static` to /etc/systemd/system/dr-provision.service? Is there a way to do that from the install.sh script or will we just need to update the dr-provision.service file on our own? We are running dr-provision on a SNAT with a dhcp-relay. Changing the service file works like a charm.

greg
2018-10-22 21:21
Feature request. :slightly_smiling_face: In general, we don't have many people doing SNAT with a dhcp-relay using the install.sh script. :slightly_smiling_face:

greg
2018-10-22 21:22
We also don't really want people using the static-ip. It gets them in trouble more often than not. In your case, it makes sense.

greg
2018-10-22 21:22
The feature request is good though.

greg
2018-10-22 21:23
Well and also the --ipaddr flag is only used for isolated installs and the output message.

greg
2018-10-22 21:24
Probably have to do it by "hand" for now.

r.levensalor
2018-10-22 21:38
@greg Thanks

r.levensalor
2018-10-22 21:46
If you are looking for more feature requests: exposing the values of Unmanaged and Pickers in the UX could be helpful.

greg
2018-10-22 23:03
- sledgehammer update in tip again. Sorry. More tweaks. This one to handle networks with portdelays.

greg
2018-10-22 23:04
Adding `provisioner.portdelay=10` to the `kernel-console` parameter will cause the stage1 sledgehammer to delay 10 seconds after marking links up before running dhcp or doing anything else. `10` is an example; it can be any other integer.
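
For anyone scripting this, here is a small Python sketch of appending the argument to an existing `kernel-console` value. The starting value `console=ttyS0,115200` is only an example, and `with_portdelay` is a hypothetical helper, not part of drpcli.

```python
def with_portdelay(kernel_console, seconds):
    """Return the kernel-console string with provisioner.portdelay set to `seconds`."""
    if isinstance(seconds, bool) or not isinstance(seconds, int) or seconds < 0:
        raise ValueError("portdelay must be a non-negative integer")
    # Drop any earlier portdelay argument so the value is not duplicated.
    args = [a for a in kernel_console.split()
            if not a.startswith("provisioner.portdelay=")]
    args.append(f"provisioner.portdelay={seconds}")
    return " ".join(args)


print(with_portdelay("console=ttyS0,115200", 10))
```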

shane
2018-10-22 23:38
- reminder on Digital Rebar Provision online meetup - Tuesday at 11am PST (tomorrow). Details: https://www.meetup.com/digitalrebar/events/lchdhpyxnbfc/


zehicle
2018-10-23 22:48

b.quan
2018-10-24 04:43
Rob, not sure what you meant by "have the right sledgehammer exploded". The bootenv I created from the default ubuntu-16.04 points to the hwe kernel, with the only difference from the default one being: ... "Initrds": [ "install/hwe-netboot/ubuntu-installer/amd64/initrd.gz" ], "Kernel": "install/hwe-netboot/ubuntu-installer/amd64/linux", ... The bootenv I created shows as both available and validated. Why is it trying, and failing, to connect to 10.11.148.4?

b.quan
2018-10-24 15:57
@zehicle ^^^

zehicle
2018-10-24 16:23
@r.levensalor clarified that you are trying to skip sledgehammer

zehicle
2018-10-24 16:23
Perhaps, look at the kexec to boot sledge then jump to install

r.levensalor
2018-10-24 16:25
Correct. We are just wanting to boot and install with the Ubuntu 16.04 HWE kernel. Would running sledgehammer first load the correct kernel for installation?

shane
2018-10-24 16:27
@shane set the channel topic: Just say NO to Threads!!

greg
2018-10-24 16:30
It could. What is that 10.... address? Is it drp?

zehicle
2018-10-24 16:47
@r.levensalor @b.quan the benefit of going to sledge first and then kexec to ubuntu install would be that you'd leverage all the default discovery workflows as a starting point and then could immediately jump to ubuntu in the workflow w/o reboot

zehicle
2018-10-24 16:48
does not require re-POSTing the machine.... we did a video about this a while back; you can't kexec right into Ubuntu, but you can into the net installer.

r.levensalor
2018-10-24 16:53
@greg 10.11.148.4 is the local and unreachable IP for the drp server. We are running with ExecStart=/usr/local/bin/dr-provision --static-ip=10.197.133.157 --force-static. So it should be trying to download from 10.197.133.157 and not 10.11.148.4.

greg
2018-10-24 16:55
Okay. Thanks. Sorry for the redundant questions. What is specifically failing? I need to walk the code. Is it the kernel line in the pxe file?

greg
2018-10-24 16:56
Or is it in the ubuntu installer already?

r.levensalor
2018-10-24 16:59
I need a minute to double check some stuff. I had a copy / paste error, and --force-stati was missing the last c. I'm restarting now and I'll trace the code to where it is failing. Thanks!

b.quan
2018-10-24 17:00
@greg It was running sledgehammer and obtained the correct IP 10.197.x.x from dhcp server, then failed to connect to 10.11.148.4 (not sure where this IP comes from) to obtain stage2.img

r.levensalor
2018-10-24 17:02
@b.quan @greg the issue was just my typo with --force-static. It worked this time

greg
2018-10-24 17:03
:sunglasses:

r.levensalor
2018-10-24 17:03
We'll script that soon, to avoid my fat fingers.
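
Scripting the service-file change could look roughly like the Python sketch below. It assumes the unit has a plain single-line `ExecStart=` as quoted earlier and that the arguments can be handled with shell-style splitting, which is not guaranteed for every systemd unit. `add_static_ip` is a hypothetical helper; it returns the new text and does not write the file or reload systemd.

```python
import ipaddress
import shlex


def add_static_ip(unit_text, ip):
    """Return unit_text with --static-ip/--force-static set on each ExecStart line."""
    ipaddress.ip_address(ip)  # raises ValueError on a malformed address
    result = []
    for line in unit_text.splitlines():
        if line.startswith("ExecStart="):
            args = shlex.split(line[len("ExecStart="):])
            # Drop earlier copies of the flags so the edit can be repeated safely.
            args = [a for a in args
                    if a != "--force-static" and not a.startswith("--static-ip=")]
            args += [f"--static-ip={ip}", "--force-static"]
            line = "ExecStart=" + shlex.join(args)
        result.append(line)
    return "\n".join(result) + ("\n" if unit_text.endswith("\n") else "")


unit = "[Service]\nExecStart=/usr/local/bin/dr-provision\n"
print(add_static_ip(unit, "10.197.133.157"))
```

Because the flag names are literals in the script, a truncated `--force-stati` cannot slip in through copy and paste.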

b.quan
2018-10-24 17:03
cool @r.levensalor

digital.rebar.slack
2018-10-26 07:46
Hey guys, I am trying the tip to PXE boot arm64. I have changed no defaults and DRP is not the DHCP server. When I upload the sledgehammer bootenv it correctly creates two isos; however, when I boot my arm machine it picks the x86 version and not the arm one. How does it decide which one to use?

greg
2018-10-26 13:23
Since DRP is not the DHCP server, you will need to force it to use the correct boot file.

greg
2018-10-26 13:24
Actually, `default.ipxe` as the bootfile might work.

greg
2018-10-26 13:25
if not, you could try `arm64.ipxe` as the boot file.

greg
2018-10-26 13:26
or if you are booting grub, you could try: `grub/arm64.cfg`

greg
2018-10-26 13:26
@digital.rebar.slack

digital.rebar.slack
2018-10-26 13:28
I'm currently only as far as discovery...

greg
2018-10-26 13:30
That is what I mean, the DHCP server needs to send one of those as the bootfile.

digital.rebar.slack
2018-10-26 13:36
...ah, ok. Cheers, I'll give that a try when I get home!

christopher_wood
2018-10-26 21:28
has joined #community201810

christopher_wood
2018-10-26 21:36
Good afternoon. There's the documentation section which says "Supports ALL orchestration tools including Chef, Puppet, Ansible, SaltStack, Bosh, Terraform, etc",

christopher_wood
2018-10-26 21:36
... comma, space... where's the documentation bit which describes how to do the integration there?

zehicle
2018-10-26 21:58
Which ones are you looking for?

christopher_wood
2018-10-26 22:00
Puppet in my case.

zehicle
2018-10-26 22:02
Nothing prewired in the community, people just build that as a stage in their workflow. Check out http://github.com/digitalrebar/colordemo to see how to add a stage.

christopher_wood
2018-10-26 22:03
looking

christopher_wood
2018-10-26 22:08
Okay, I think I get it? Stage has one or more tasks, which template the specified file and then run the job. Shell script in this case. I'll have to read this over more carefully.

zehicle
2018-10-27 01:23
The challenge with making community stages for agents is that you need the server to test/verify because they are tightly coupled.

patrick.miller
2018-10-27 17:28
is there a centos 6 bootenv available somewhere, or should I just create one?

greg
2018-10-27 17:33
In the contrib content pack, I think.

greg
2018-10-27 17:34
@patrick.miller - drp-community-contrib content pack.

digital.rebar.slack
2018-10-27 18:02
@greg I have now switched to using DRP as the DHCP server. After loading the stage1.img and vmlinuz0 files I get "Bad Linux ARM64 Image magic!"... any ideas

chedbob.pm
2018-10-27 18:24
thanks greg

greg
2018-10-27 18:56
Maybe not the right ARM type

seaton
2018-10-28 07:29
has joined #community201810

seaton
2018-10-28 07:38
Hi all, sorry for the noob question, but what's the best practice for using DRP per site/DC, i.e. should a single instance reside on a backend "management" vlan with each managed device having a leg in the same subnet, or should there be an instance of DRP on the frontend subnet vlan?

zehicle
2018-10-28 11:45
Either is ok. What are you trying to optimize?

zehicle
2018-10-28 22:33
FYI - this is big news IBM to Buy Red Hat, the Top Linux Distributor, for $34 Billion https://nyti.ms/2COnWSf

zehicle
2018-10-28 22:37
@seaton to be more descriptive... DRP supports either model. It's more of a style choice depending on how you manage your network traffic and management VLANs. We see lots of variation so there's no right way. It depends more on your scale, security approach and need to shape traffic on the management network based on provisioning frequency. DRP can easily scale to handle significant workload (especially if you offload image and repos to dedicated data stores). On the other hand, the content APIs make it easy to manage distributed endpoints so having a lot of smaller endpoints is not a design problem. I hope that helps! and $welcome

2018-10-28 22:37
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

seaton
2018-10-29 01:11
thanks @zehicle We're only a small site with a couple of small data centres. I just fired up DRP last night and must say has been well though out and easy to use.

seaton
2018-10-29 01:15
thought*

daniel.bernier
2018-10-29 14:28
coreos

christopher_wood
2018-10-29 15:55

christopher_wood
2018-10-29 15:55
Ending in Rw54Bt9.

greg
2018-10-29 15:57
Maybe- it is an example profile that is not applied to anything by default.

greg
2018-10-29 15:57
@christopher_wood - it is intended as a specific example that isn't used.

christopher_wood
2018-10-29 15:58
No worries, as long as it wasn't a horrible surprise.

greg
2018-10-29 16:00
Well - you know. Probably not the best place for a public key, but once it was committed to GitHub, I figured, meh. At least it isn't in the global profile by default for all of you to inherit. :slightly_smiling_face: I think that happened once, a long time ago, which is why I'm not too concerned about it now. As long as I can't log into your systems by default.

christopher_wood
2018-10-29 16:01
If you find you can, say hi to our security team and cc me. :wink:

greg
2018-10-29 16:01
:slightly_smiling_face:

zehicle
2018-10-29 16:17
@daniel.bernier playing Alex Trebek.... can you phrase that as a question?

zehicle
2018-10-29 16:18
always assumed that it was Greg's attempt to take over the world

daniel.bernier
2018-10-29 16:41
@zehicle good one

greg
2018-10-29 20:33
- Warning - tip has issues with plugins and events that can lock up the system. This was injected two weeks ago. This is mostly an issue with the Packet plugin, but others could be affected.

greg
2018-10-29 20:33
I'm looking at fixing it now.

daniel.bernier
2018-10-30 03:41
@zehicle question would have been: any new successful attempt at supporting CoreOS?

zehicle
2018-10-30 04:03
@daniel.bernier we (RackN) have not been focused on it but I'd suggest looking back to Sept 13 in this channel where Greg talks about it. https://rackn.slack.com/archives/C02L9P26Q/p1536875626000100 Also, @tgelter, do you have updates based on your work w/ CoreOS?

b.quan
2018-10-30 04:35
If I want to use a custom post-install template during deployment, is there an easy way to specify the template as a param (just like select-kickseed param) or do I have to define a custom bootenv in order to do that?

shane
2018-10-30 04:53
@b.quan - what are you trying to accomplish ?

shane
2018-10-30 04:53
any stage you place in a workflow, after the (for example) `ubuntu-18.04-install` stage will be executed as a post-install action

shane
2018-10-30 04:54
this is true of any of the DRP community content, contrib, and rackn bootenvs that utilize the stock templates

shane
2018-10-30 04:58
Here's an example of an Ubuntu install that executes 2 post-install stages - `packet-ssh-keys` (installs SSH keys from the Packet meta data service), and `runner-service` (makes the DRP agent resident after install for advanced lifecycle management), then marks the Machine as completed: `ubuntu-18.04-install --> packet-ssh-keys --> runner-service --> complete-nowait`
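
As a rough illustration of that ordering, here is a Python sketch that assembles the stage list into a workflow-shaped object. It assumes a workflow is represented as an object with `Name` and `Stages` fields; the workflow name `ubuntu-18.04-packet` and the helper `make_workflow` are invented for the example.

```python
import json


def make_workflow(name, install_stage, post_install_stages,
                  final_stage="complete-nowait"):
    """Build a workflow dict: install stage first, then post-install stages, then the final stage."""
    stages = [install_stage, *post_install_stages, final_stage]
    if len(set(stages)) != len(stages):
        raise ValueError("each stage should appear only once")
    return {"Name": name, "Stages": stages}


workflow = make_workflow(
    "ubuntu-18.04-packet",  # hypothetical workflow name
    "ubuntu-18.04-install",
    ["packet-ssh-keys", "runner-service"],
)
print(json.dumps(workflow, indent=2))
```

Anything placed in `post_install_stages` runs after the OS install stage, which is the mechanism described above for post-install actions.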

b.quan
2018-10-30 05:24
@shane It helps. Thanks for pointing me to the direction of using stages in workflow to perform different kinds of post-install actions!

shane
2018-10-30 05:29
You can see an example of custom content in the color demo repo on GitHub. It shows you example content for creating stages, tasks, and templates.

b.quan
2018-10-30 06:29
cool, thanks @shane

seaton
2018-10-30 10:23
I'm having problems with the krib demo in that the installed yum mirror is timing out when trying to access http://mirrors.edge.kernel.org : http://mirrors.edge.kernel.org/centos/7/extras/x86_64/Packages/docker-1.13.1-75.git8633870.el7.centos.x86_64.rpm: [Errno 12] Timeout on http://mirrors.edge.kernel.org/centos/7/extras/x86_64/Packages/docker-1.13.1-75.git8633870.el7.centos.x86_64.rpm: (28, 'Operation too slow. Less than 1000 bytes/sec transferred the last 30 seconds') Trying other mirror.

seaton
2018-10-30 10:25
can the image be updated to find closest mirror, similar to normal centos repo?

seaton
2018-10-30 10:41
it only retries the same mirror until it fails and then task fails

seaton
2018-10-30 11:02
ok I think I have it solved, just retrying now. I created a new profile based on the repository template, updated the repositories so they are pointing to local mirrors, and assigned it to the test cluster. Will know shortly if it's worked

seaton
2018-10-30 11:16
nope didn't work, still using http://mirrors.edge.kernel.org :disappointed:

bagricola
2018-10-30 11:31
Q: is there any good reason to not use the "files" functionality to upload e.g. a switch image binary? It's a lazy way of hosting cumulus images to install switches but wondering if there's a downside (e.g. if all the support files are synced onto newly provisioned hosts when they boot then having a 200M switch image in there would probably be bad)

seaton
2018-10-30 12:17
further to my krib mirror issues when installing docker, if I use the krib-live workflow then the repos from sledgehammer are pointing to the fastest mirror, which is local to my country.

zehicle
2018-10-30 13:10
@bagricola it's fine to use. If you are pushing lots of big images around in parallel, then it could become a bottleneck

bagricola
2018-10-30 13:11
Cool :+1: It's probably only going to be <latest image>, DRP is dhcp server on the management VLAN so when the switches do first-boot they get an IP & reservation from DRP, seems to make sense to put the image there for initial deployment

zehicle
2018-10-30 13:12
That feature is there for that purpose

bagricola
2018-10-30 13:13
do you have anyone running drpcli on cumulus switches yet? :smile:

bagricola
2018-10-30 13:13
i have some x86 based 100G switches here :slightly_smiling_face: they obv don't pxe boot so can't install the runner by default, but...

bagricola
2018-10-30 13:14
they do have "zero touch provisioning" which can run a script through a DHCP option

zehicle
2018-10-30 13:15
Not yet. Excited to track your progress

bagricola
2018-10-30 13:15
ha. I'll give it a shot :slightly_smiling_face: already blown these away and had to reimage a couple of times lol

zehicle
2018-10-30 13:27
@seaton so, it works from the live workflow?

zehicle
2018-10-30 13:27
Or, it does not work in either case?

greg
2018-10-30 13:44
- tip should now have the deadlock fix. Please move to the latest tip if you are on tip. :wink:

christian.tardif
2018-10-30 13:56
Quite interested about this discussion. I'm working with @daniel.bernier, trying to find a decent CoreOS deployer. So far, success has been limited with our "current" deployer (MaaS)

zehicle
2018-10-30 14:07
@christian.tardif can you describe what you want it to do? That would help me frame a helpful answer

christian.tardif
2018-10-30 14:54
Actually, the process of CoreOS installation isn't really difficult. What they provide, out-of-the-box, is an ISO from which you spin up an application that needs very simple information: where to install, what image version (by default, latest stable), and an ignition file to customize the installation on first boot. This ISO has a big problem: it does not support UEFI (!!!!). So basically, the only viable solution is to replace the ISO installation with a custom deployer that would do the same: what, where, how. The "hack" we did in MaaS actually starts Ubuntu 18.04 (well, this is how MaaS works) then pushes the image that we provided to the server on the target drive, along with basic networking, partitioning, and a few host parameters (like hostname). I don't think we should go further than that. From the CoreOS point of view, I want to be able to get the most out of the ignition file (most probably with your template stuff). For anything more fancy, that should be up to another tool (Ansible for example) to complete. That said, the idea with CoreOS is to avoid unnecessary over-configuration: we want to deploy, use, then replace. CoreOS creates many partitions, one of them being the ROOT partition for persistence between OS upgrades.

zehicle
2018-10-30 14:55
Is it possible to boot CoreOS with kexec? If so, you could boot into sledgehammer and then kexec into CoreOS.

zehicle
2018-10-30 14:56
FWIW - we do exactly the same pattern using Sledgehammer as the ephemeral OS but it's based on Cent7 and easier to extend/manage using normal tools

zehicle
2018-10-30 14:57
@christian.tardif we typically just have discussions like this in the primary channel and avoid threads

christian.tardif
2018-10-30 14:58
@zehicle Understood :slightly_smiling_face:

christian.tardif
2018-10-30 15:00
@zehicle As I understood, that's the custom images support in DRP, right?

zehicle
2018-10-30 15:01
Sledgehammer is the base discovery OS for DRP, it's part of the core system

zehicle
2018-10-30 15:02
custom images is to take existing O/S images (like from Packer) and write them to disk then boot

christian.tardif
2018-10-30 15:07
OK, so that's how CoreOS is installing (from images). AFAIK, there's no "installation" process per se for CoreOS. They're in the "deployment" process. So then, that would be a matter of telling DRP where to deploy the image, building the ignition file to include, and making sure GRUB points to the right place so the machine can eventually reboot.... depending on what this means for DRP

zehicle
2018-10-30 16:12
if you are building a boot from disk CoreOS, yes. That's what the RackN image builder workflows do. if you are net installing CoreOS then you could hand off to the net install via kexec as in this video: https://www.youtube.com/watch?v=Xm688Km3N4Y

zehicle
2018-10-30 16:13
We show it for Ubuntu & Centos. I have no idea if it would work for CoreOS the same way

zehicle
2018-10-30 16:14
@christian.tardif also, DRP can provide the cloud-init via a DRP dynamic template driven from params/profiles

christian.tardif
2018-10-30 16:57
Will check the video. Meanwhile, CoreOS (now RedHat) is dropping support for cloud-init soon. Only ignition will last. That said, it's more or less the same thing. The main difference between the two is how and when this happens in the boot process. Ignition happens very early in the boot process, theoretically being able to do much more, as it happens while the boot process is still in the ramfs space.

zdunn
2018-10-30 19:42
has joined #community201810

shane
2018-10-30 20:02
@zdunn @welcome :slightly_smiling_face:

zdunn
2018-10-30 20:06
hello!

shane
2018-10-30 20:27
that was supposed to be $welcome ...

2018-10-30 20:27
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

zdunn
2018-10-30 20:28
no worries

zehicle
2018-10-30 20:33
we just merged some interesting prototype work about using DRP workflow & runner in cloud instances WITHOUT provisioning the machine first. This is a potentially handy way to build workflows without needing any metal at all AND saving the boot times.

zehicle
2018-10-30 20:33

zehicle
2018-10-30 21:59
I've verified that the same code works the same way on Google Cloud too

zehicle
2018-10-30 22:46
which is super handy because you can just "reset" the machines and delete and they come right back

seaton
2018-10-31 02:58
@zehicle krib live is using fastest mirrors, and the repos on the sledgehammer image look like stock standard centos repos. This is getting me through the next steps, but it is now failing on the etcd-config task with some 403 errors, so I will do some more reading on the configs available to me first before coming back here.

shane
2018-10-31 03:00
@seaton - I believe I wired in the ability to change the repo locations for the Etcd/Kubernetes pieces. You should be able to specify a param if there is a GEO located mirror that is better suited for your location


seaton
2018-10-31 03:03
thanks @shane still getting my head around how it all works, I did try but it didn't work, still playing so will work it out :slightly_smiling_face:

shane
2018-10-31 03:04
looks like we still hard code the Kubernetes repo - still digging for the Etcd pieces... https://github.com/digitalrebar/provision-content/blob/master/krib/templates/kubernetes-install.sh.tmpl

seaton
2018-10-31 03:05
i think it's around the centos repos that get injected in on the centos install of krib. they are hardcoded in

seaton
2018-10-31 03:06
so docker install task times out and fails

shane
2018-10-31 03:06
`package-repositories` param is defined to allow you to override the package repository mirror locations https://provision.readthedocs.io/en/tip/doc/arch/data.html?highlight=package-repositories#the-package-repositories-param

seaton
2018-10-31 03:07
my etcd-config task failing I think may be down to my config I'm using (or not using)

shane
2018-10-31 03:08
the docker piece is installed from standard repos - by default it is US centric repos - but the `package-repositories` allow you to override the repo mirror - which also should make docker/etcd install use those local mirrors you define

seaton
2018-10-31 03:11
Thanks @shane I had found that and I thought I had overridden it with more local repos, so I'm just working out what I've done wrong there. This is the new profile I created, called OzMirrors, that contains the property "mirror-package-repositories"; in this I have defined the following array: [ { "installSource": true, "os": [ "centos-7" ], "tag": "centos-7", "url": "http://mirror.optus.net/centos/7/os/x86_64/" }, { "installSource": true, "os": [ "centos-7.5.1804" ], "tag": "centos-7.5.1804", "url": "http://mirror.optus.net/centos/7.5.1804/os/x86_64/" }, { "components": [ "atomic", "centosplus", "cr", "dotnet", "extras", "fasttrack", "os", "rt", "updates" ], "distribution": "7", "os": [ "centos-7", "centos-7.3.1611", "centos-7.4.1708", "centos-7.5.1804" ], "tag": "centos-7-everything", "url": "http://mirror.optus.net/centos" }, { "distribution": "7", "os": [ "centos-7", "centos-7.3.1611", "centos-7.4.1708", "centos-7.5.1804" ], "tag": "epel-7", "url": "http://mirror.optus.net/fedora-epel" } ]

seaton
2018-10-31 03:12
This profile has then been applied to the machines in question using bulk actions. Is this the correct way of doing it?

shane
2018-10-31 03:12
(please drop code like that in to a "snippet" - use the Plus "+" button to snippet-ize it :slightly_smiling_face: )

seaton
2018-10-31 03:12
ahh sorry

shane
2018-10-31 03:13
yes - that's the correct procedure in general - the content of your Param may / may not be correct

seaton
2018-10-31 03:13
much better my slack fu is not strong

shane
2018-10-31 03:13
thank you :slightly_smiling_face: much better

seaton
2018-10-31 03:13
cheers that's enough to point me.

shane
2018-10-31 03:14
It looks right on first blush - but I haven't played w/ the package-repo param much so not 100% sure on that

shane
2018-10-31 03:15
In the snippet you can also set the content type - eg JSON - and we get prettyprint colorized output

seaton
2018-10-31 03:16
so from reading the docs around krib, for the demo really all I should do is clone the example-krib parameter to a new name and fill in the two parameters to the same as the profile name? or do I need some additional config?

shane
2018-10-31 03:16
Obviously - but just in case, I'll say it - setting the param needs to be completed prior to doing the OS install to set the Repos at install time correctly

seaton
2018-10-31 03:16
yes understood

shane
2018-10-31 03:17
Yes - basically for a simplified demo install that's all that's needed - should just be the Etcd and KRIB cluster profile name params

seaton
2018-10-31 03:20
cool @shane ok just want to check I'm heading in the right direction.

seaton
2018-10-31 03:24
just testing the `package-repositories` param now as I had `mirror-package-repositories` set

b.quan
2018-10-31 03:38
Any insight in the following error that I'm running into (the IP address in the URL is the DRP server IP)?

zehicle
2018-10-31 03:39
@b.quan what bootenv are you using?

b.quan
2018-10-31 03:40
@zehicle ubuntu-16.04-install stock bootenv

seaton
2018-10-31 03:43
@shane mirror override is working using `package-repositories` param thanks
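
Since the fix came down to the param name, here is a minimal Python sketch of building the profile with the key that worked, `package-repositories`. The profile shape (`Name` plus `Params`) follows the machine JSON pasted earlier in the channel; the repository entries are abbreviated from the OzMirrors example, and treating `tag`, `os` and `url` as the minimum fields is an assumption made for this sketch, not a documented rule.

```python
import json

PARAM_NAME = "package-repositories"  # the name that worked; not "mirror-package-repositories"


def repo_profile(name, repos):
    """Build a profile dict carrying the package-repositories param (hypothetical helper)."""
    for repo in repos:
        missing = {"tag", "os", "url"} - repo.keys()
        if missing:
            raise ValueError(f"repo entry is missing fields: {sorted(missing)}")
    return {"Name": name, "Params": {PARAM_NAME: repos}}


profile = repo_profile("OzMirrors", [
    {
        "installSource": True,
        "os": ["centos-7"],
        "tag": "centos-7",
        "url": "http://mirror.optus.net/centos/7/os/x86_64/",
    },
    {
        "distribution": "7",
        "os": ["centos-7", "centos-7.5.1804"],
        "tag": "epel-7",
        "url": "http://mirror.optus.net/fedora-epel",
    },
])
print(json.dumps(profile, indent=2))
```

As noted above, the profile has to be on the machine before the OS install starts for the mirrors to take effect.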

greg
2018-10-31 03:46
@b.quan - is the ip address right? for your networking?

b.quan
2018-10-31 03:46
yes

greg
2018-10-31 03:47
Can you curl the file? Is the machine still in the bootenv for ubuntu? Is that the UUID for that machine?

b.quan
2018-10-31 03:54
curl does not err out, but returns nothing

b.quan
2018-10-31 03:54

greg
2018-10-31 03:55
is the bootenv available?

greg
2018-10-31 03:55
and valid?

b.quan
2018-10-31 03:59
Yes, both available and validated. It's the stock bootenv one for ubuntu-16.04


zehicle
2018-10-31 04:03
sort of old...

zehicle
2018-10-31 04:05
This is suggesting that the md5sum could be out of date for the ISO:


zehicle
2018-10-31 04:05
@b.quan please check the md5sums... it's possible that the community bootenv checksum is out of date compared to the latest ISOs

b.quan
2018-10-31 04:13
@zehicle I did not see checksum stored as a bootenv attribute, how do I get the community bootenv checksum for comparison?

zehicle
2018-10-31 04:16
It's in .IsoSha256 : c94de1cc2e10160f325eb54638a5b5aa38f181d60ee33dae9578d96d932ee5f8

zehicle
2018-10-31 04:16
it looks OK if you are using 16.04.5


b.quan
2018-10-31 04:20
Here it is: "IsoFile": "ubuntu-16.04.5-server-amd64.iso", "IsoSha256": "c94de1cc2e10160f325eb54638a5b5aa38f181d60ee33dae9578d96d932ee5f8", "IsoUrl": "http://mirror.math.princeton.edu/pub/ubuntu-iso/16.04/ubuntu-16.04.5-server-amd64.iso",
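
For anyone following along, the `IsoSha256` value is a SHA-256 digest (not an md5sum), so the comparison can be done against the ISO file directly. A minimal sketch, assuming you have the ISO on local disk; the function names are made up for illustration and the file path is whatever applies on your system:

```python
import hashlib

# Value quoted in the thread from the bootenv's IsoSha256 field.
EXPECTED_SHA256 = "c94de1cc2e10160f325eb54638a5b5aa38f181d60ee33dae9578d96d932ee5f8"


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def iso_matches(path, expected=EXPECTED_SHA256):
    """True when the file's digest equals the expected value (case-insensitive)."""
    return sha256_of_file(path) == expected.strip().lower()
```

Calling `iso_matches("ubuntu-16.04.5-server-amd64.iso")` from the directory holding the ISO returns `False` if the download differs from what the bootenv expects.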

greg
2018-10-31 04:37
@b.quan - `drpcli machines show 884a7803-ee39-42bd-8fed-95002fb52656`

greg
2018-10-31 04:37
What are the workflow, stage, and bootenv for that machine?
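
A small sketch of pulling just those three fields out of the `drpcli machines show` output. It assumes `drpcli` is on the PATH and already pointed at the endpoint, and that the machine object uses the field names `Workflow`, `Stage`, and `BootEnv`; the helper names are hypothetical:

```python
import json
import subprocess


def summarize_machine(machine_json):
    """Pick the workflow, stage, and bootenv out of a machine object's JSON text."""
    machine = json.loads(machine_json)
    # Field names are assumed from the DRP machine model; missing ones map to None.
    return {key: machine.get(key) for key in ("Workflow", "Stage", "BootEnv")}


def show_machine(uuid):
    """Run `drpcli machines show <uuid>` and summarize its output."""
    # Assumes drpcli is installed and configured with endpoint and credentials.
    result = subprocess.run(
        ["drpcli", "machines", "show", uuid],
        check=True, capture_output=True, text=True,
    )
    return summarize_machine(result.stdout)
```

`show_machine("884a7803-ee39-42bd-8fed-95002fb52656")` would return a three-key dict for the machine in question.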

b.quan
2018-10-31 04:39
@greg thanks. I'll get the info to you tomorrow.

seaton
2018-10-31 04:49
still working my way through the krib demo and finding things are borking at the etc-config stage. docker is now installing from local mirror like a trooper, however etc-config always fails at the same point so I feel it's something I'm doing (or not doing)

seaton
2018-10-31 04:50
same message on all hosts

greg
2018-10-31 05:01
My guess is that you don't have the krib profile setup correctly.

seaton
2018-10-31 08:57
@greg you're correct, I found a typo where the config did not match the profile name. I corrected it and all is working as it should :smile: Thanks
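
This kind of mismatch is easy to check for before kicking off the workflow. A rough sketch, assuming the profile is available as a dict with `Name` and `Params` keys and that the two cluster params are named `etcd/cluster-profile` and `krib/cluster-profile` (names assumed from the KRIB docs discussed above, so check them against your content version):

```python
# Param names are assumptions; adjust to match the KRIB content you have installed.
CLUSTER_PROFILE_PARAMS = ("etcd/cluster-profile", "krib/cluster-profile")


def mismatched_cluster_params(profile):
    """Return {param: value} for cluster-profile params that differ from the profile Name."""
    name = profile.get("Name")
    params = profile.get("Params", {})
    return {
        key: params.get(key)
        for key in CLUSTER_PROFILE_PARAMS
        if params.get(key) != name
    }


# Hypothetical cloned profile with a typo in one param.
profile = {
    "Name": "my-krib",
    "Params": {
        "etcd/cluster-profile": "my-krib",
        "krib/cluster-profile": "my-kirb",
    },
}
print(mismatched_cluster_params(profile))  # {'krib/cluster-profile': 'my-kirb'}
```

An empty dict means both params point back at the profile they live in.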

seaton
2018-10-31 08:59
well that is it gets past etc-config

seaton
2018-10-31 09:03
I'm now getting some preflight errors at the `krib-config` stage, and what looks like component version errors

greg
2018-10-31 11:22
Make sure the clocks in your machines are synced

seaton
2018-10-31 11:40
@greg thanks, clocks are synced and diff between all three in the cluster would be less than 1 sec between them from what I can see.

zdunn
2018-10-31 12:50
is anyone doing anything with ZFS in their images?

zdunn
2018-10-31 12:50
we are looking to either have a step to build the zfs libs into an ubuntu base or maybe push an image with the zfs libs prebuilt

zdunn
2018-10-31 12:50
just wondering if anyone had run into this before

shane
2018-10-31 13:44
@seaton - this safety check is good - it indicates you have an older installed Kubelet than the control plane containers that were pulled in

shane
2018-10-31 13:44
did you use a custom `kubeadm.cfg` ?

shane
2018-10-31 13:45
you could attempt to fix it by setting `krib/cluster-kubernetes-version` to the value `v1.12.2` to force it to pull in the newer version

darcher
2018-10-31 16:38
has joined #community201810

zdunn
2018-10-31 17:02
any tricks to getting the sledgehammer live image to capture LLDP ?

zehicle
2018-10-31 17:06
we have a stage for that....

smedefind
2018-10-31 17:07
We're using the stage but it just gets `{}`

smedefind
2018-10-31 17:07
trying to debug it on the image now

shane
2018-10-31 17:08
@darcher $welcome

2018-10-31 17:08
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

zehicle
2018-10-31 17:08
@zdunn @smedefind you need to turn on LLDP for the switch (or VM host if you are using VMs)

smedefind
2018-10-31 17:08
LLDP seems to be working on the switch, I can see neighbors etc

zehicle
2018-10-31 17:09
the LLDP on the machine is acting as a client for the switch LLDP service

zehicle
2018-10-31 17:10
the stage uses pretty basic dump LLDP output, so you may need to troubleshoot on a console of the machine(s)

smedefind
2018-10-31 17:10
yeah, that's where we are now

shane
2018-10-31 17:10
@zdunn are your Machines you want LLDP client packets from bare metal, or VMs in a hypervisor ?

zdunn
2018-10-31 17:11
baremetal

zehicle
2018-10-31 17:11
let me make sure I understand... you want to read LLDP data about the machines from the switch or vice versa?

smedefind
2018-10-31 17:12
We want to pull the switch data down

smedefind
2018-10-31 17:12
But also it's nice if the switch would also see it

zdunn
2018-10-31 17:12
so we can map metal to ports

zdunn
2018-10-31 17:12
true!

zdunn
2018-10-31 17:12
but baby steps

zdunn
2018-10-31 17:13
@zehicle this works on other machines in our fleet

smedefind
2018-10-31 17:14
```
[lldpcli] # show neighbors
-------------------------------------------------------------------------------
LLDP neighbors:
-------------------------------------------------------------------------------
Interface:    ens1f0, via: LLDP, RID: 1, Time: 0 day, 21:46:06
  Chassis:
    ChassisID:    mac 00:1c:73:af:33:ff
    SysName:      rack04-tor-01
    SysDescr:     Arista Networks EOS version 4.18.10M running on an Arista Networks DCS-7050S-64
```

zdunn
2018-10-31 17:14
bah! do you even snippet brah?

smedefind
2018-10-31 17:14
Slack failed

zehicle
2018-10-31 17:15
what's the JSON output? `lldpctl -f json `

smedefind
2018-10-31 17:15
```

smedefind
2018-10-31 17:15
huh, what is with slack

smedefind
2018-10-31 17:16
```
{
  "lldp": {
    "interface": [
      {
        "ens1f0": {
          "via": "LLDP",
          "rid": "1",
          "age": "0 day, 21:50:05",
          "chassis": {
            "rack04-tor-01": {
              "id": {
                "type": "mac",
                "value": "00:1c:73:af:33:ff"
              },
              "descr": "Arista Networks EOS version 4.18.10M running on an Arista Networks DCS-7050S-64",
```

zehicle
2018-10-31 17:16
also, which version are you using? Tip sledgehammer has been making some changes to enable multi-arch and IPv6

smedefind
2018-10-31 17:16
Whatever latest stable is

zdunn
2018-10-31 17:16
@smedefind the last json snippet is NOT from a sledgehammer machine correct?

shane
2018-10-31 17:16
Well you get what you pay for ... and Slack is paying to host on AWS ... so ...

smedefind
2018-10-31 17:16
Sorry, no

smedefind
2018-10-31 17:17
`sledgehammer-9c1ad5cb7483928e6aba1d93ba363de929169f37.tar`

greg
2018-10-31 17:36
What was the output of the job log for the lldp task on the failing machine? @smedefind

smedefind
2018-10-31 17:37
Hm, maybe I'm not looking in the right place, but all I see for a job log RE lldp is

smedefind
2018-10-31 17:37
```
Log for Job: 6d32c2e2-068c-4deb-a8e9-cd6b5e190df9
Machine 29b87bc7-483f-4d4b-b5fe-c4099637dbe9 changing from stage runner-service to sledgehammer-wait
```

smedefind
2018-10-31 17:38
It's the same log for every task on that machine

greg
2018-10-31 17:39
It should be for task: `network-lldp`

greg
2018-10-31 17:42
No that is a stage change, not a task run.

greg
2018-10-31 17:42
There should be one after that

smedefind
2018-10-31 17:43
There is, but looks exactly the same, just a different Task name and Previous Job ID changes

greg
2018-10-31 17:43
okay - just a second

smedefind
2018-10-31 17:46
I should probably clarify the problem. I think the lldp task is working as it should.

smedefind
2018-10-31 17:47
lldpd on the sledgehammer instance just isn't seeing any lldp traffic from the switch, and the switch isn't seeing any from the sledgehammer instance

zehicle
2018-10-31 17:47
thinks that I can add a "this is a transition task" to the UX

smedefind
2018-10-31 17:47
So lldp neighbors is blank as it should be according to what lldpd is seeing

greg
2018-10-31 17:48
hmm - okay

greg
2018-10-31 17:48
I have this in my job log for task network-lldp

greg
2018-10-31 17:48
This is running on a machine in http://packet.net

smedefind
2018-10-31 17:49
yeah, I only have the log shown above in all the tasks

greg
2018-10-31 17:49
Then something isn't running right...

greg
2018-10-31 17:49
it would seem

shane
2018-10-31 17:51
what are the NIC interface names in your bare metal machines, @smedefind

smedefind
2018-10-31 17:52
eth0 and en02 (Another weird issue, should be eno*)

greg
2018-10-31 17:53
eth0 is the default naming and it gets locked by netboot.

smedefind
2018-10-31 17:53
ah

greg
2018-10-31 17:53
En02 gets renamed by kernel

tom.gillman
2018-10-31 17:53
so, we just worked through a similar problem here. The card is swallowing the LLDP data because it has its own agent in firmware. We had to disable that to make the LLDP data visible to the OS

tom.gillman
2018-10-31 17:54
if it's an Intel i40e card

greg
2018-10-31 17:54
while using DRP, @tom.gillman?

tom.gillman
2018-10-31 17:54
yes

smedefind
2018-10-31 17:54
oh... I just had a flashback to this problem 5 years ago

smedefind
2018-10-31 17:54
yeah, I remember having to do that too

tom.gillman
2018-10-31 17:54
We made a modified version of the lldp script to force that to turn off

greg
2018-10-31 17:55
Can you share that? I'll put it in the tree if y'all don't want to do PRs?

smedefind
2018-10-31 17:56
yup, that fixed it

smedefind
2018-10-31 17:56
@tom.gillman Thanks

greg
2018-10-31 17:56
@smedefind - what did you end up doing?

smedefind
2018-10-31 17:57
`echo lldp stop > /sys/kernel/debug/i40e/0000\:1a\:00.0/command`

smedefind
2018-10-31 17:57
Would need some massaging to be scripted
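
One way the massaging might look: loop over every i40e device that shows up in debugfs rather than hard-coding a PCI address. This is only a sketch, not the script that was shared in the channel; it assumes debugfs is mounted at `/sys/kernel/debug`, that it runs as root, and the function name is made up:

```python
from pathlib import Path

# debugfs location taken from the one-liner above; needs root and a mounted debugfs.
I40E_DEBUGFS = Path("/sys/kernel/debug/i40e")


def stop_firmware_lldp(root=I40E_DEBUGFS):
    """Write 'lldp stop' to each i40e command file under root; return the PCI addresses touched."""
    root = Path(root)
    stopped = []
    if not root.is_dir():
        return stopped  # no i40e devices present, or debugfs not mounted
    for device_dir in sorted(root.iterdir()):
        command_file = device_dir / "command"
        if command_file.exists():
            # Same effect as: echo lldp stop > .../<pci-address>/command
            command_file.write_text("lldp stop\n")
            stopped.append(device_dir.name)
    return stopped
```

The `root` argument exists so the loop can be exercised against a scratch directory before pointing it at real hardware.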

greg
2018-10-31 17:57
yeah, but that is okay.

greg
2018-10-31 17:59
thanks - I'll add an issue for it shortly.

smedefind
2018-10-31 17:59
Setting Management Address to the IPMI address would be nice too

zdunn
2018-10-31 17:59
Sigh - Intel was helping!

greg
2018-10-31 18:00
The ipmi plugin has stages to discover and/or configure that.

greg
2018-10-31 18:01
Then we could change the lldp start script to set it.

tom.gillman
2018-10-31 18:03
This is what we added

christopher_wood
2018-10-31 18:06
Newb question again, sorry. Which bit of the documentation discusses how I would populate /etc/dr-provision, and/or what format I would use for custom stuff that should be read on daemon start?

shane
2018-10-31 18:07
thanks, @tom.gillman

tom.gillman
2018-10-31 18:07
We went ahead and brought all of the interfaces online up front, and we're using LLDP for cable validation

zdunn
2018-10-31 18:07
@tom.gillman yeah I think that's part of what we want here as well

zdunn
2018-10-31 18:08
we need to be able to create Port Channels on the switches

zdunn
2018-10-31 18:08
and we've had incidents of incorrect cabling

shane
2018-10-31 18:10
@christopher_wood - what are you trying to do with `/etc/dr-provision` ??

christopher_wood
2018-10-31 18:11
@shane, the idea is to provide some things available from daemon startup, like pre-generated machine definitions, bootenvs, that sort of thing. Not sure if I should be stuffing those in files or adding them right after dr-provision is started.

christopher_wood
2018-10-31 18:12
So if we lose a blade and the VM running dr-provision dies, nobody has to load all that stuff back in manually after VMWare boots the host up again.

shane
2018-10-31 18:14
no - you shouldn't touch those directory structures - bootenvs, stages, tasks, workflows, templates ... etc - all should be handled as "content" - organized in "packs" - we have an example Content Pack, called colordemo: https://github.com/digitalrebar/colordemo

shane
2018-10-31 18:14
there are videos and help information in the README how to use it

shane
2018-10-31 18:15
The preferred method for Machines - is to use the Discovery stage in a Workflow - to allow the `sledgehammer` image to run, and do all of the right things to create a machine, and allow you to do physical hardware preparation via additional workflows (if desired)

christopher_wood
2018-10-31 18:16
Thanks, sounds way better than managing files. Youtube with lunch then.

shane
2018-10-31 18:16
our Quickstart uses the CLI to walk you through all of this process

shane
2018-10-31 18:16
you can find our example Discovery workflow in the quickstart, at: https://provision.readthedocs.io/en/tip/doc/quickstart.html#create-a-workflow

shane
2018-10-31 18:17
check the $faq for tips on how to add SSH keys - so you can SSH in to the Sledgehammer/Discovery booted machine - if you need to have a "look see" around the environment


christopher_wood
2018-10-31 18:18
The quickstart worked for me out of the box, but I don't think I quite have my head around the rest. Bookmarking all those, thank you.

shane
2018-10-31 18:19
Cool - start w/ the Colordemo - which is the best way for building custom content to do "interesting" things for your own use case ... it teaches you how to write/build/manage content as Git versioned artifacts that make it easy to work on (collaboratively)

greg
2018-10-31 18:19
Thanks, @tom.gillman - that is what I was going to write.

tom.gillman
2018-10-31 18:19
No problem.

greg
2018-10-31 18:20
@tom.gillman - how are you choosing to validate the cables? Are you doing it outside of the system or did you build a task validator?

tom.gillman
2018-10-31 18:21
It's external for now. Once the machine record is populated with the LLDP info, we have a python script that runs and makes some assumptions, then produces a CSV with a PASS/FAIL
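
The actual script and its assumptions are not shown in the channel, but the general shape of an external PASS/FAIL report can be sketched as below. Everything here is hypothetical: the input format (tuples of host, interface, switch, port), the `allowed` lookup, and the column names are invented for illustration:

```python
import csv
import io


def cabling_report(observed, allowed):
    """Return CSV text with one PASS/FAIL row per observed link.

    observed: iterable of (host, interface, switch, port) tuples from LLDP data.
    allowed:  dict mapping host -> set of (switch, port) pairs it may be cabled to.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["host", "interface", "switch", "port", "result"])
    for host, interface, switch, port in observed:
        ok = (switch, port) in allowed.get(host, set())
        writer.writerow([host, interface, switch, port, "PASS" if ok else "FAIL"])
    return buffer.getvalue()
```

A host with no entry in `allowed` fails every row, which errs on the side of flagging unknown machines.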

greg
2018-10-31 18:21
okay - for a different customer (a long time ago), I built tasks for that so the machine would "fail" and not continue.

greg
2018-10-31 18:22
With DRP plugins, you could actually drive it directly. hmm integrated with a switch hmmm.

tom.gillman
2018-10-31 18:23
It's on the list for inclusion later on. We needed a quick and dirty to be prod ready

zehicle
2018-10-31 18:55
@tom.gillman the inventory stage has the ability to 1) run a command and parse keys from json 2) stuff those into params and 3) validate that the values match a regex

zehicle
2018-10-31 18:56
so you could do an LLDP collect, jq select data and verify it was correct.

zehicle
2018-10-31 18:56
automatically checks that it did not change too
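
To make the collect / select / verify idea concrete outside of DRP, here is a standalone sketch of the same pattern. It is not how the inventory stage is implemented; it just parses `lldpctl -f json` output in the layout pasted earlier in the thread and checks each neighbor name against a regex. The function names and the example pattern are hypothetical:

```python
import json
import re


def neighbor_names(lldp_json):
    """Map interface name -> neighbor SysName from `lldpctl -f json` output.

    Follows the layout pasted earlier: lldp.interface is a list of
    {ifname: {"chassis": {sysname: {...}}}} objects.
    """
    data = json.loads(lldp_json)
    interfaces = data.get("lldp", {}).get("interface", [])
    if isinstance(interfaces, dict):  # defensive: tolerate a single object instead of a list
        interfaces = [interfaces]
    result = {}
    for entry in interfaces:
        for ifname, details in entry.items():
            chassis = details.get("chassis", {})
            # The chassis object is keyed by the neighbor's SysName.
            result[ifname] = next(iter(chassis), None)
    return result


def failing_interfaces(names, pattern):
    """Return interfaces whose neighbor name does not fully match the regex."""
    regex = re.compile(pattern)
    return sorted(
        ifname for ifname, name in names.items()
        if name is None or not regex.fullmatch(name)
    )
```

For the output above, `neighbor_names(...)` would give `{"ens1f0": "rack04-tor-01"}`, and `failing_interfaces(names, r"rack04-tor-\d+")` would come back empty.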

tom.gillman
2018-10-31 19:01
I don't dispute that, and part of it might be that DRP is new to us and I was unaware we could do that. The question is finding time to go back and read all the docs and watch all the videos. It was just easier to hack out a script.

tom.gillman
2018-10-31 19:05
We're trying to make it as generic as possible, though. Something along the lines of "Oh, this is where this host has nics plugged in, based on <some math> they should be here" or "this guy is plugged into a port that isn't in the set of ports he should have"

tom.gillman
2018-10-31 19:07
Do we want a fail at that point? maybe, maybe not. We might want to go ahead and finish the build and note it as a "Hey check this out", and do whatever swapping we need to do and let the switch relearn it.

zehicle
2018-10-31 19:24
both sad and happy to hear that we have so much documentation that it's too much

zehicle
2018-10-31 19:26
@tom.gillman makes sense. we've heard enough stories that there are some pretty common patterns and we've been coding to that. Don't be shy about asking if there's a short cut

seaton
2018-11-01 05:32
@shane Thanks for the heads-up, I'm not using a custom `kubeadm.cfg`, just as per the docs, which is the cloned example krib profile with the necessary changes. Have now added `krib/cluster-kubernetes-version` to the cloned profile and will see where that takes me :slightly_smiling_face: