2018-06-04 10:52
I got the terraform provider to build and init, but I'm getting an error when running:

2018-06-04 10:52
Error: provider.drp: Internal validation of the provider failed! This is always a bug with the provider itself, and not a user issue. Please report this bug:

2018-06-04 11:11
I didn't see the prebuilt binary for some reason. Using that now.

2018-06-04 13:52
I now have a node which gets to terraform-ready ok, then when I run terraform setting workflow to one I've created, a simple ubuntu-18.04 install -> complete, I get the following after the node reboots: "Booting kernel failed: Invalid argument"

2018-06-04 13:54
The workflow is set correctly, and the Description is too. The metadata does not get set though. I've tried adding a 'machine-metadata' stage before the ubuntu-install stage but it's the same result

shane
2018-06-04 13:54
@ - do you have your Bionic Beaver install working on its own - without dealing with the Terraform layer added in there ?

2018-06-04 13:55
Humm I tested it on another box but haven't on this one tbh

2018-06-04 13:55
I will do so

2018-06-04 14:01
one thing I'm a little unclear on is why can't you change the boot environ in bulk tasks, but you can change stage?

2018-06-04 14:04
ah so the ubuntu-18.04-install bootenv fails on its own, even without terraform/workflows. I assumed it would work as it did on another box I'm testing on at the moment (at home).

shane
2018-06-04 14:14
The BootEnvs are supposed to be manipulated via Workflow changes --> which changes the stage of a machine - a Stage defines the BootEnv as one of the things that it manipulates

shane
2018-06-04 14:14
If you enable Stages (by setting defaultStage) - then BootEnvs on their own don't transition machines any more

2018-06-04 14:15
ok so "stages & workflows" are kind of a separate optional method of managing the box which relegates the bootenv to a second-class citizen. Kinda!?

shane
2018-06-04 14:16
since the designed path to modify a Machine's state is through Workflow -> Stages -> BootEnvs - it didn't make sense to leave the Bulk Actions cluttered with too many extra options

shane
2018-06-04 14:16
Sure - that works as an analogy - but BootEnvs are still necessary/important - they're just driven ultimately via Workflows

2018-06-04 14:16
right ok

shane
2018-06-04 14:17
The "old" workflow system (`change-stage/map` Param) was a second class citizen. In ... somewhere around v3.7.0 era - we rebuilt the workflow system giving `Workflows` first class citizenship

2018-06-04 14:52
thanks. was my fault the boot wasn't working -- server was out of storage :man-facepalming:

shane
2018-06-04 14:53
No worries - it's nice when it's a simple issue :slightly_smiling_face:

2018-06-04 14:54
@shane Is there a way I can specify a different `pxelinux.0`?

2018-06-04 14:56
I cannot find where `lpxelinux.0` is specified to be loaded when using pxelinux ... I would like to override the setting to load my own `pxelinux.0`

2018-06-04 14:59
ah ... it is the DHCP option 67 in the subnet definition?

shane
2018-06-04 14:59
Initially yes - then the BootEnv has PXE templates defined which change the behavior

shane
2018-06-04 15:00
I'd like to ask why you need something custom - we do a LOT of autodetection of broken PXE/UEFI implementations and dynamically serve PXE information based on that

shane
2018-06-04 15:00
if you use a custom PXE then you may lose a lot of goodness we do - if it's a behavior issue, let's discuss that, and it's possible it might be something worth/needing to roll in to our PXE pieces

2018-06-04 15:01
your `lpxelinux.0` is v6.03 ... I need v3.86

shane
2018-06-04 15:01
ok - let's get @vlowther onboard with this discussion

2018-06-04 15:01
VMware ESXi requires this version otherwise it won't load the `mboot.c32` file

shane
2018-06-04 15:02
ah yes vmware fun ...

vlowther
2018-06-04 15:03
@ We have it -- our esxi bootenvs chainload from lpxelinux.0 to esxi.0 with all the appropriate args.

vlowther
2018-06-04 15:03
esxi.0 is pxelinux 3.86

2018-06-04 15:04
so how can I initiate the `esxi.0`?

vlowther
2018-06-04 15:05
Are you using our esxi bootenvs, or ones you have written?

2018-06-04 15:06
I don't see any esxi bootenvs. I was going to ask that myself

vlowther
2018-06-04 15:07
Yes, they are RackN licensed, and not available in the default community content.

2018-06-04 15:08
I have created my own ESXI bootenv ... where can I find yours?

2018-06-04 15:08
I cannot see it in the Catalog ...

shane
2018-06-04 15:09
correct - since it's RackN commercial you'd have to have a license to use it

2018-06-04 15:10
I see

shane
2018-06-04 15:10
you can however do a commercial 30 day trial with us - and we can provide you a 30 day trial license to test it (and more things)

2018-06-04 15:10
so what's the trick to employ the `esxi.0` from `lpxelinux.0`?

vlowther
2018-06-04 15:10
chainload it.

2018-06-04 15:12
is there some other example for the chainloading? ... I have no idea what and where should be chainloaded ...

florent.wagener
2018-06-04 15:17
hey guys, quick question. I am working on a script to monitor drp, so I am using `drpcli info status` to gather the services' health status. Of course this command doesn't work if the dr-provision service isn't running, and I get an error like this:
```
Error: Error creating session: CLIENT_ERROR: Get https://127.0.0.1:8092/api/v3/users/rocketskates/token: dial tcp 127.0.0.1:8092: getsockopt: connection refused
```
which is completely fine. My question is: are there any other error cases that I should consider in my script ?

florent.wagener
2018-06-04 15:19
Looking at the code of `info.go` it doesn't seem so but I am not really an expert in Go :slightly_smiling_face:

vlowther
2018-06-04 15:21
No, assuming you can talk to the info API at all, the rest of the failure cases will be standard auth failures and the returned struct telling you about what ports are reachable and what you should expect to be able to reach.
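
For anyone scripting this, a minimal sketch of how a monitoring wrapper might sort the failure cases mentioned above. Assumptions: `drpcli` exits non-zero on failure, and the auth-failure patterns are guesses; only the "connection refused" text comes from the error quoted earlier. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Classify the result of `drpcli info status` for a monitoring script.
# Only "connection refused" is taken from the quoted error; the auth
# patterns below are assumptions and may need adjusting.
classify_drp_result() {
  local exit_code="$1" output="$2"
  if [ "$exit_code" -eq 0 ]; then
    echo "ok"
  elif [[ "$output" == *"connection refused"* ]]; then
    echo "service-down"
  elif [[ "$output" == *"401"* || "$output" == *"403"* || "$output" == *[Uu]nauthorized* ]]; then
    echo "auth-failure"
  else
    echo "unknown"
  fi
}

# Run the real command and classify what came back.
check_drp() {
  local output rc
  output="$(drpcli info status 2>&1)"
  rc=$?
  classify_drp_result "$rc" "$output"
}
```

Calling `check_drp` from cron or a monitoring agent would then print one of `ok`, `service-down`, `auth-failure`, or `unknown`.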

florent.wagener
2018-06-04 15:21
alright perfect.

2018-06-04 15:28
@vlowther any hint on the chain loading?

shane
2018-06-04 15:31
@ we're in meetings this morning, we'll get back to you in a bit

vlowther
2018-06-04 16:00
@ Best way would be to sign up for a 30 day trial license for the os-other content, then take a look at how we did it once you have access.

vlowther
2018-06-04 16:01
Shortest path for that is to send an email to detailing what you need.

florent.wagener
2018-06-04 17:56
is there a way to command the services individually (BINL, API, DHCP, Static, TFTP) from drpcli ?

shane
2018-06-04 17:57
@florent.wagener - what did you have in mind relating to "command"? each of the services are either enabled/disabled via startup flags, then the appropriate API/CLI calls make changes to the individual services

shane
2018-06-04 17:57
DHCP is related to Subnets, Reservations, and Leases

florent.wagener
2018-06-04 17:57
just start and stop them

florent.wagener
2018-06-04 17:58
because when I disabled my subnets and ran `drpcli info status`, DHCP still appeared enabled and active

shane
2018-06-04 17:58
yes - via the startup flags to the `dr-provision` start options

shane
2018-06-04 17:59
```
drp@pixie:~$ ./dr-provision --help | grep disable
  --disable-tftp          Disable TFTP server
  --disable-provisioner   Disable provisioner
  --disable-dhcp          Disable DHCP server
  --disable-pxe           Disable PXE/BINL server
```

vlowther
2018-06-04 17:59
yeah. Disabling or deleting subnets does not stop the DHCP server.

florent.wagener
2018-06-04 17:59
okok got it

vlowther
2018-06-04 18:00
It merely stops the DHCP server from responding to requests of any kind for that subnet
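
A small sketch of turning a list of services into startup flags, in case that helps when editing a systemd unit. The four flag names are the ones from the `--help` output above; the `disable_flags` helper itself is hypothetical.

```shell
#!/usr/bin/env bash
# Build dr-provision startup flags from a list of services to disable.
# Accepted names mirror the --disable-* flags shown in the --help output.
disable_flags() {
  local svc flags=()
  for svc in "$@"; do
    case "$svc" in
      tftp|provisioner|dhcp|pxe) flags+=("--disable-$svc") ;;
      *) echo "unknown service: $svc" >&2; return 1 ;;
    esac
  done
  echo "${flags[*]}"
}
```

For example, `./dr-provision $(disable_flags dhcp pxe)` would start the server without DHCP and PXE/BINL, which is different from disabling a subnet.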

florent.wagener
2018-06-04 18:01
I was running dr-provision through systemd and didn't check the options

florent.wagener
2018-06-04 18:01
:smile:

florent.wagener
2018-06-04 18:08
Last dumb question: is it possible to have a service `Alive` but not `Enabled` (and vice-versa)? If yes, how can I test it ?

shane
2018-06-04 18:11
No - I don't think so - `enabled` refers to it being enabled in the start up options - while Alive is an active health check that is run to verify the service itself is responding

florent.wagener
2018-06-04 18:31
well, thanks to all this information I am now monitoring DRP services :slightly_smiling_face:

2018-06-05 13:20
has joined #community201806

2018-06-05 13:23
Hello, I'm new to DR (and to slack, although I'm using Gitlab's mattermost).

2018-06-05 13:23
Is this the place to ask some basic questions about Digital Rebar ?

shane
2018-06-05 13:24
yes it is ... and $welcome @

2018-06-05 13:24
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

2018-06-05 13:24
ok thanks

shane
2018-06-05 13:25
it'll be a short bit until we can respond, we have a daily stand-up meeting starting now

shane
2018-06-05 13:25
but plz ask your questions

2018-06-05 13:25
I'm considering switching from XCAT to some other provisioning tool for the initial bootstrap of HPC cluster servers

2018-06-05 13:26
Before XCAT we did use cobbler but XCAT was cool because it offers discovery

2018-06-05 13:27
my understanding is that DR could be an XCAT alternative (as The Foreman is) and I'm trying to install it on a test VM/VLAN/subnet

2018-06-05 13:28
the VM I'd like to run it on has 2 network interfaces in 2 different subnets and I want to be sure to start DHCP *only* on the correct one

2018-06-05 13:28
is the 'define a subnet in json' section the answer in the doc to that ?

2018-06-05 13:30
in short :
eth0 <-> vlan A (which may receive DHCP requests for the whole site dhcp server)
eth1 <-> vlan B : I want DR to answer dhcp requests only on this interface

shane
2018-06-05 13:30
Yes - "define subnet" does this for each Network Pool you want to service for DHCP

shane
2018-06-05 13:31
each `Subnet` defined can be switched on or off to define whether or not to answer DHCP queries for hosts coming from that subnet
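
A sketch of what "define a subnet in JSON" could look like for the vlan B side only. The field names follow the Subnet object as I understand it from the DRP quickstart docs, so treat them as an assumption to check against your version; all addresses and the `subnet_json` helper are placeholders.

```shell
#!/usr/bin/env bash
# Print a Subnet definition for the one network that should be served.
# Field names are assumed from the DRP quickstart; values are placeholders.
subnet_json() {
  local name="$1" cidr="$2" start="$3" end="$4" gateway="$5"
  cat <<EOF
{
  "Name": "$name",
  "Subnet": "$cidr",
  "ActiveStart": "$start",
  "ActiveEnd": "$end",
  "ActiveLeaseTime": 60,
  "ReservedLeaseTime": 7200,
  "Enabled": true,
  "Strategy": "MAC",
  "Options": [
    { "Code": 3, "Value": "$gateway", "Description": "Default Gateway" }
  ]
}
EOF
}
```

Usage would be along the lines of `subnet_json vlan-b 192.168.100.1/24 192.168.100.100 192.168.100.200 192.168.100.1 | drpcli subnets create -`. With no Subnet defined for vlan A, requests arriving from that network have nothing to match.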

shane
2018-06-05 13:31
You can also use the Web Portal to do all of that - and Subnet creation is easier via Portal: https://portal.rackn.io/

2018-06-05 13:32
but I'm talking about an on premise installation, right ?

shane
2018-06-05 13:32
yes - Portal can manage the on-prem DRP Endpoints (services)

2018-06-05 13:32
ok thanks

shane
2018-06-05 13:32
or API or CLI

shane
2018-06-05 13:33
or for commercial users that need full airgap capability, you can run the Portal on-prem as well - that's a commercial license

2018-06-05 13:33
I've got 2 other simple questions at this point.

2018-06-05 13:35
XCAT discovery allowed switched-based discovery which was very useful for us because we couldn't trust people who reported us the MAC addresses

2018-06-05 13:35
So we only had to map hostname <-> switch/port into XCAT database

2018-06-05 13:36
[as a matter of fact, we don't want a random discovery because servers are grouped by 4 of them in a 2U chassis, so we want to control the naming sequence]

2018-06-05 13:36
Can DR provide a similar discovery process ?

2018-06-05 13:37
My second question is about the IPMI card configuration : my understanding is that it is possible but through a commercial DR plugin. Is this correct ?

2018-06-05 13:37
thanks

spector
2018-06-05 13:43
@ I have some more help coming for you in a bit (they are driving to office) as Shane had to head out this morning for a meeting. Stay tuned

2018-06-05 13:43
ok thanks. Don't worry, I'm not in a hurry :wink:

spector
2018-06-05 13:44
Perfect. Appreciate your patience.

greg
2018-06-05 14:09
@ - I'm coming to get yelled at for this, but here we go.

greg
2018-06-05 14:10
There is a switch discovery stage using LLDP to get port info.

greg
2018-06-05 14:10
This stores the port info on the machine as parameters that can be then used to drive other actions. The UX has a switch viewer.

zehicle
2018-06-05 14:11
notes that @greg is *supposed* to be on vacation

greg
2018-06-05 14:12
With regard to sequencing, the bring up can be sequenced or "fixed" after the fact. For your naming desire, I would add a naming task in a discovery stage that would take the LLDP info and whatever else and rename the node to your desired name. That way post discovery you would have a correctly named node ready for consumption.
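
To make the naming idea concrete, a sketch of the arithmetic such a task might do. The scheme is entirely hypothetical: it assumes ports are cabled in order with four nodes per chassis, so ports 1-4 are chassis 1, ports 5-8 chassis 2, and so on. How the port number is read from the LLDP Params is left out.

```shell
#!/usr/bin/env bash
# Derive a node name from the switch port a machine was discovered on.
# Hypothetical layout: four consecutive ports per chassis.
node_name_for_port() {
  local prefix="$1" port="$2"
  local chassis=$(( (port - 1) / 4 + 1 ))   # 1-based chassis index
  local slot=$(( (port - 1) % 4 + 1 ))      # 1-based slot within the chassis
  printf '%s-c%02d-n%d\n' "$prefix" "$chassis" "$slot"
}
```

With this layout, `node_name_for_port hpc 6` gives `hpc-c02-n2`.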

greg
2018-06-05 14:14
With regard to the second question, IPMI plugin provides stages that can be used to configure an access user and IP address. These range from DHCP or Static or DHCP reserved to Static. These tasks set access parameters on the machine that enable the power control actions.

greg
2018-06-05 14:15
With regard to your zeroth question, The discovery process tries to inject the booting MAC into the kernel lines so that at least sledgehammer BootEnv can stay on the same DHCP subnet as it goes through the boot discovery process.

2018-06-05 14:15
Thank you greg. We indeed do something similar with what XCAT calls a 'chain'. One element of the chain is to configure a static ip and user passwd for the ipmi.

greg
2018-06-05 14:16
Okay - then it is very similar but added as a stage in the discovery workflow.

greg
2018-06-05 14:16
for us.

2018-06-05 14:16
My concern about dhcp was not to wreak havoc in other non-test subnets.

greg
2018-06-05 14:18
okay - then for that, it is all good. DRP can be configured with only the subnets you want and then direct connected or DHCP relayed. Additionally, DRP can be run as a proxy in parallel to the existing DHCP server, or use the results of an external DHCP server. Lots of options with little minor details of config depending upon choice.

2018-06-05 14:21
is ipmi and LLDP in the commercial zone or is it provided in the base DR product ? In fact I can sum up my needs like so :
- switch-based discovery
- being able to name hosts according to their port location in order to match the sequence in the chassis
- being able to configure IPMI at discovery time
- being able if possible to update BIOS firmware (we did that running a - often bogus - proprietary utility)
- being able to install an OS with some ssh keys or user setup in order to escape as soon as possible to an ansible config of the host

2018-06-05 14:21
As a matter of fact, DR seems to suit all of this

greg
2018-06-05 14:24
Bios Firmware will be based upon hardware type. Though the system is pluggable or DRP could sequence your utility. The nicety with DRP is that a maintenance workflow could be created to offline the node, update it, then reenable it.

greg
2018-06-05 14:24
IPMI, BIOS, and LLDP are licensed components.

2018-06-05 14:24
I forgot, we may leave this to go back to stateful nodes but can we use DR to manage an OS image the nodes would PXE boot from in the case of stateless compute nodes (OS in a ramdisk) ?

greg
2018-06-05 14:24
That workflow can control is my basic test work flow.

2018-06-05 14:25
I mean not only provide an image in the PXE environment but tools to (re)build it...and so on ?

greg
2018-06-05 14:25
That is what sledgehammer is. We also have a kubernetes demo that is that exact use case. We boot sledgehammer and bringup k8s in that environment.

2018-06-05 14:25
[sorry for the basic questions]

greg
2018-06-05 14:26
Oh - RackN has (in various states for completeness) tools to build in-memory images, capture system images, or build full system images. All of these could be installed with the image-deploy tool set or booted from a BootEnv as a diskless image

greg
2018-06-05 14:28
@shane's been working on the image capture and generalizing @vlowther's sledgehammer build environments. We now can (need to switch over to it on the next sledgehammer update) build sledgehammer as a set of stages in DRP.

2018-06-05 14:31
What we are doing for now is : we have some stateful nodes (the "admin" nodes, i.e. admin nodes, drm node, accounting nodes) and stateless compute nodes. But it's kind of hard to manage because in short when we deploy something with ansible on the "live" nodes, we play the same roles inside the corresponding chroot and just use an XCAT command which packs the image. But working with chroots causes some difficulties and may not be worth it :wink:

2018-06-05 14:32
It may be outside the scope of DR the way the image is built

2018-06-05 14:33
Thanks all for your time. I'll try to play with a test installation at this point.

2018-06-05 14:33
What about the commercial vs free features ? Are IPMI and LLDP commercial "plugins" ?

greg
2018-06-05 14:34
Yeah - we are finding that chroot hopping is useful for building some things.

greg
2018-06-05 14:35
We are talking about a drpcli feature that @vlowther will probably look into that builds and captures images from chroot or at least can run tasks in chroots.

2018-06-05 14:39
Maybe a more general question : I previously understood (maybe I was wrong) that DR was kind of "meant" to fit in a wider picture, such as to be underneath Terraform for instance. Does it make sense to use DR only for initial hardware bootstrapping (as I explained) or is it kind of overkill ? Why would someone, for instance, choose DR over, let's say, the Foreman for this use alone ?

2018-06-05 14:41
[the fact that the foreman implies Ruby on rails is no fun to me though]

2018-06-05 14:42
Sorry I didn't see you answered my commercial vs free question. My mistake

vlowther
2018-06-05 14:43
We are happy for people to use DR for initial hardware bootstrapping. :slightly_smiling_face:

vlowther
2018-06-05 14:48
A significant number of the extra features dr-provision has grown started as ways to make all the various tasks needed to discover and configure the servers easier to write, refactor, and maintain. That they also enable us to do things like stand up a Kubernetes cluster from scratch in 5 mins (not including reboot time) is a spiffy added bonus. :slightly_smiling_face:

2018-06-05 14:50
Thank you. I'll give it a try. I appreciate the time you spent to clarify all this and congratulation for the work.

2018-06-05 14:54
I'm trying to provision an ESXi box with the templates from the `os-other` content package and I'm not able to get past the stage where the PXE is trying to load the `pxelinux.cfg/<UUID|HexIP>`. The UUID I can see in the PXE is different from the Machine UUID. Any idea what could be wrong?

vlowther
2018-06-05 14:58
The UUID that PXE tries is the DHCP Client ID, which we don't use at all.

vlowther
2018-06-05 14:59
We only write pxelinux config files for the IP address in hex form and the mac address we expect the system to boot from (if known)
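
For checking which files to look for, a sketch that computes the two `pxelinux.cfg/` names mentioned above. It relies on standard pxelinux naming (uppercase hex for the IPv4 address; `01-` plus the lowercase, dash-separated MAC for Ethernet), not on anything DRP-specific; the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Compute the pxelinux.cfg file names a client would request.

# IPv4 address as eight uppercase hex digits, e.g. for pxelinux.cfg/<HexIP>.
ip_to_pxe_hex() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  printf '%02X%02X%02X%02X\n' "$a" "$b" "$c" "$d"
}

# MAC address as "01-" (Ethernet ARP type) plus lowercase octets joined by dashes.
mac_to_pxe_name() {
  local mac="${1,,}"        # lowercase; needs bash 4 or newer
  echo "01-${mac//:/-}"
}
```

pxelinux also falls back to progressively shorter prefixes of the hex IP before trying `default`, so a missing full-length file does not by itself explain a failed boot.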

vlowther
2018-06-05 14:59
hm.

2018-06-05 15:04
I see. Well, the HexIP is not getting any config either. Is there a way I can check that the file is available?

2018-06-05 15:07
I have a Workflow called `esxi` which has two Stages - `esxi-install` and `complete`. The `esxi-install` stage employs the `esxi-670-install` BootEnv which is a clone of your `esxi-650a-install` with changed ISO and SHA256 sum ...

vlowther
2018-06-05 15:15
Huh. What is your default bootenv? When it hit pxelinux.cfg/default the system should have booted into Sledgehammer.

vlowther
2018-06-05 15:16
Also, what is your unknown bootenv?

2018-06-05 15:17
My Default BootEnv is `local`.

2018-06-05 15:18
And my Unknown BootEnv is `discovery`.

vlowther
2018-06-05 15:18
ok

vlowther
2018-06-05 15:20
Has the machine you are trying to install been through discovery?

vlowther
2018-06-05 15:20
I assume there is a machine config for it...

2018-06-05 15:23
no, it's a fresh Machine created by:
```
echo '{
  "Name": "test1",
  "Description": "",
  "Address": "192.168.100.20",
  "Stage": "esxi-install",
  "Runnable": true,
  "Workflow": "esxi"
}' | drpcli machines create -
```

vlowther
2018-06-05 15:25
hm

vlowther
2018-06-05 15:26
What does `drpcli machines show Name:test1` return?


vlowther
2018-06-05 15:30
ok

vlowther
2018-06-05 15:30
Is dr-provision handling DHCP?

2018-06-05 15:31
yes

2018-06-05 15:32
I can kickstart CentOS on the same Machine if I switch to a different Workflow ...

vlowther
2018-06-05 15:33
hm


2018-06-05 15:58
@ something is setting TFTP Prefix for your ESXI install

2018-06-05 16:00
i'm pretty sure that shouldn't be /esxi/install when trying to pull the pxelinux.cfg files

2018-06-05 16:12
@ajones the `esxi/install` corresponds to my Workflow called `esxi`

2018-06-05 16:13
I can query all the configs via `curl` or `tftp` so all should work

2018-06-05 16:13
It looks like it's some issue with the VirtualBox PXE ...

vlowther
2018-06-05 16:18
@ Our ESXi bootenvs use tftp prefixes to ensure that the pxelinux paths that pxelinux 3.86 uses do not overlap with the paths that pxelinux 6.04 uses

vlowther
2018-06-05 16:20
since 3.86 is required for esxi, and uses com32 binary modules instead of the elf ones that syslinux 5.0 and later use.

2018-06-05 16:28
@vlowther Ah, I should have guessed that ESXi would need special handling

vlowther
2018-06-05 16:29
It is an annoyance one has to live with when dealing with trying to netboot ESXi.

shane
2018-06-05 16:38
In a little over an hour we will be hosting our 19th Meetup for Digital Rebar. We hope to see you all there! We'll be demoing Kubernetes deployment from bare metal via the KRIB (kubeadm) pattern, have an introduction and discussion on the upcoming Network Configurator tool for Workflow use, and as always - Community discussion time. Meetup information is at: https://www.meetup.com/digitalrebar/events/lchdhpyxjbhb/

2018-06-05 18:02
I'm just lurking :slightly_smiling_face:

spector
2018-06-05 18:06
Not a problem at all. Nothing more fun than lurking in open source

2018-06-05 19:47
How do we pull facts in rebar about the node that was provisioned with sledgehammer?

shane
2018-06-05 19:48
are you referring to information in the Inventory (gohai-inventory) JSON blob - or you want to have "custom facts" related to a machine ?

2018-06-05 19:50
dmidecode information or how i can upload a script so it runs.

2018-06-05 19:51
I'm trying to create a workflow that calls the stackstorm api with the facts gathered by sledgehammer

shane
2018-06-05 19:51
ah - so you basically will want to duplicate / customize a Stage - if you follow the elements of a Stage via the Portal, you can see how the pieces are put together

shane
2018-06-05 19:52
ultimately - it leads to a Template - which can be anything - in this case it might be a Bash script, Python, or similar - you can take the JSON output of the `gohai-inventory` and slice/dice it however you want and use that to call the StackStorm api

shane
2018-06-05 19:55
I haven't evaluated this - but there is a github repo that has some basic Stackstorm plus device42 stuff - you can ignore the device42 pieces - but you might get some ideas from it: https://github.com/deusofnull/st2-digital-rebar

shane
2018-06-05 19:56
basically - a Stage refers to Tasks that define something to do - and a Task refers to Templates - templates can be Bash or whatever, and it is also golang templatized - so you can make flexible/reusable and dynamically changing content pieces based on Params (variables) and Profiles (collections of Params)

2018-06-05 20:05
Thank you @shane

shane
2018-06-05 20:06
hopefully that makes sense? feel free to ask questions if not :slightly_smiling_face:

2018-06-05 20:07
by setting up a new machine with sledgehammer, the only data I see is mac address and ip address with uuid

shane
2018-06-05 20:10
You'll need to make sure that Sledgehammer runs Discovery process - and that the `gohai-inventory` task runs

shane
2018-06-05 20:10
if you are NOT running Sledgehammer - you can initiate the `gohai` inventory via the `drpcli` client side binary

2018-06-05 20:11
currently running sledgehammer

shane
2018-06-05 20:11
the `discover` stage runs the `gohai-inventory` task - or, it is supposed to :slightly_smiling_face:

2018-06-05 20:11
is this information supposed to be seen in the machine info

shane
2018-06-05 20:11
if it does - when you iterate the Machine object - you should see a Param named `gohai-inventory` which is a large JSON structure of the machine inventory


2018-06-05 20:18
so i see gohai tasks in the ui

2018-06-05 20:18
but i do not see data

shane
2018-06-05 20:28
The `gohai-inventory` is a Param - that should be listed further down on the page


shane
2018-06-05 20:34
that's one of my Machines - you can also pull the UUID of the machine, and via the CLI - you can do: `drpcli machines show Name:<Machine_name> | jq '.Params."gohai-inventory"'`

shane
2018-06-05 20:35
or `drpcli machines show <UUID> | jq '.Params."gohai-inventory"'`
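
The two forms above can be folded into one helper. This is only a convenience sketch: it assumes `drpcli` and `jq` are on the PATH and that `drpcli` is already pointed at the endpoint; `gohai_inventory` is a made-up name.

```shell
#!/usr/bin/env bash
# Print the gohai-inventory Param for a machine, given a Name or a UUID.
gohai_inventory() {
  local ref="$1"
  # Anything that does not look like a UUID is treated as a machine name.
  if [[ ! "$ref" =~ ^[0-9a-fA-F]{8}-([0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}$ ]]; then
    ref="Name:$ref"
  fi
  drpcli machines show "$ref" | jq '.Params."gohai-inventory"'
}
```

If the result is `null`, the `gohai-inventory` task most likely has not run for that machine yet.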

2018-06-05 22:29
qq, How hard or easy is it to extend gohai or is it better to write a python script

shane
2018-06-05 22:30
gohai is written in ... Golang ... so extending it isn't exactly hard - but it's not easily exposed to make changes to it

2018-06-05 22:30
I need to collect lldp information

shane
2018-06-05 22:30
the larger question would be ... what do you wish Gohai did... that it doesn't do today ?

shane
2018-06-05 22:31
ah - we have RackN supported LLDP switch content today

2018-06-05 22:31
to see what ports it is connected to

shane
2018-06-05 22:32
yep - which is why we have the LLDP content - it is also planned to allow you to map out and show physical switch port info - but that's a future piece - today it shows the discovered LLDP info on each Machine - adding Params with the Switch Port details

2018-06-05 22:34
how do I add that behavior?

shane
2018-06-05 22:37
It's available for Registered users of the Portal - via the Portal, go to Contents, and add the `task-library` content - you should see some new stages - namely `network-lldp`

shane
2018-06-05 22:37
add that Stage to one of your workflows - generally speaking your Discovery workflow - though you'll need to "re-Discover" your Machines if they were run without that Stage previously

2018-06-05 22:48
got it and testing now

2018-06-05 22:48
ty very much for pointing me in the right direction

shane
2018-06-05 22:54
sweet !

2018-06-06 00:08
It didn't collect any data

2018-06-06 00:08
I'm going to run the lldp command manually

shane
2018-06-06 00:09
@zehicle might be able to shed some light on this

zehicle
2018-06-06 02:25
Lldpd needs to be running on the host or switch. It's not a passive thing.

zehicle
2018-06-06 02:26
For vms, run it on the host

2018-06-06 13:44
has joined #community201806

2018-06-06 16:12
```
Log for Job: ce35da65-2e64-4afb-8845-26fe04f16bcd
Starting task network-lldp on 66f80e7e-383f-459a-aef4-1e7e37222fce
Starting command ./network-lldp-network-lldp-start.sh.tmpl
Command running
Checking and starting LLDPD if necessary ...
Not running. Starting ...
Created symlink from /etc/systemd/system/multi-user.target.wants/lldpd.service to /usr/lib/systemd/system/lldpd.service.
Sleeping for '20' seconds to collect LLDPD data...
20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Finished successfully
Command exited with status 0
Action network-lldp-start.sh.tmpl finished
Starting command ./network-lldp-network-lldp-client.sh.tmpl
Command running
Collecting Neighbors information from LLDPctl
{}
Finished successfully
Command exited with status 0
Action network-lldp-client.sh.tmpl finished
Task network-lldp finished
Updated job ce35da65-2e64-4afb-8845-26fe04f16bcd to finished
Task signalled that it finished normally
```

2018-06-06 16:12
@zehicle :point_up: output of job

shane
2018-06-06 18:05
@ $welcome :slightly_smiling_face:

zehicle
2018-06-06 21:19
@ it looks like LLDP is not returning any data. That's normal if you don't have lldpd running on the host or switch. What happens if you run the command by hand on the system?

zehicle
2018-06-06 21:20
It's just running "lldpctl -f json" and then passing that into a param

zehicle
2018-06-06 21:21
actually, it starts lldpd ("systemctl start lldpd")

zehicle
2018-06-06 21:21
then waits and then runs lldpctl
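
To reproduce those steps by hand, something like the sketch below should be close. The commands (`systemctl start lldpd`, `lldpctl -f json`) and the 20-second wait come from the messages and job log above; the helper names and the "empty means `{}`" check are assumptions based on that one log.

```shell
#!/usr/bin/env bash
# Decide whether lldpctl output carries any neighbor data.
# An empty object ("{}"), as seen in the job log, counts as no data.
lldp_has_data() {
  local json="${1//[[:space:]]/}"   # strip all whitespace before comparing
  [ -n "$json" ] && [ "$json" != "{}" ]
}

# Manual run of the task's steps (as root on the machine being discovered).
lldp_collect() {
  systemctl start lldpd
  sleep "${1:-20}"                  # give lldpd time to hear from neighbors
  lldpctl -f json
}
```

For example: `out="$(lldp_collect)"; lldp_has_data "$out" || echo "no LLDP neighbors seen"`. If that still prints nothing useful, the switch (or, for VMs, the host) is probably not sending LLDP frames.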

dave.parker
2018-06-06 21:43
If I have a custom task that runs a template, what is the easiest way to see what that template looks like for a particular machine?

zehicle
2018-06-06 21:57
I think setting the debug level to highest is the easiest

zehicle
2018-06-06 21:57
@vlowther may know about a flag for dry run or other options

zehicle
2018-06-06 21:58
the challenge is getting the system into the right stage - you'll need to set the stage and clear the workflow

vlowther
2018-06-06 22:00
Our design renders templates on demand. We don't really have a better way to see what a template will look like beyond cranking renderer logs all the way up and trying to run the task in question.

dave.parker
2018-06-06 22:00
Ok

shane
2018-06-06 22:01
@dave.parker check the item 22.18 in the $faq


shane
2018-06-06 22:03
that discusses rendering kickstart/preseed - using a Phantom host to trick a system in to rendering them - doesn't address generic templates though

shane
2018-06-07 18:33
- we had a great meetup earlier this week, covering the Kubernetes Cluster deployment with our KRIB tooling. We introduced the concepts and directions for our _Net Wrangler_ tool ... and as always, community questions and answers. At the next meetup on June 19th we'll continue with a demo of our Kubernetes HA cluster builds with KRIB-HA, also demo Net Wrangler in operation, and also announce the v3.9.0 enhancements, features, and bug fixes. We look forward to seeing you there. Check out the meetup page for all details and RSVP: https://www.meetup.com/digitalrebar/events/lchdhpyxjbzb/

2018-06-07 23:00
has joined #community201806

shane
2018-06-07 23:49
@ $welcome

2018-06-11 16:04
has joined #community201806

shane
2018-06-11 16:13
@ $welcome

2018-06-11 17:36
Thanks @shane. Been following Rob on and off since Crowbar days. Worked in the private cloud space as an intern at Nimbula and am kicking tires again.

zehicle
2018-06-11 18:47
@ glad to have you here! It's been a wild ride since those days - still amazed that SUSE kept that project going.

dave.parker
2018-06-11 21:57
I have an issue with a script that tries to do an apt-get install for some extra packages after an ubuntu install. The script just does an apt-get update and an apt-get install, but the package(s) fail to install because apt can't resolve some of the dependencies. But the same script runs fine after a reboot. What's the difference between the environment that scripts started by the runner run in and the environment of the freshly rebooted system?

shane
2018-06-11 21:59
apt sucks in chroot environments

shane
2018-06-11 21:59
this is the pattern I generally use in non-interactive environments to try and get APT to play nicely:
```
# enable non-interactive installs
export DEBIAN_FRONTEND=noninteractive
cat > /usr/sbin/policy-rc.d << EOF
#!/bin/sh
echo "All runlevel operations denied by policy" >&2
exit 101
EOF
chmod 755 /usr/sbin/policy-rc.d
# generic update to the platform
apt-get update
apt-get -y -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold" dist-upgrade
```

shane
2018-06-11 21:59
followed by: ```rm -f /usr/sbin/policy-rc.d``` after done
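
The setup and cleanup halves of that pattern can be wrapped so the policy file is always removed afterwards. A sketch only: `with_policy_rc_d` is a hypothetical helper, and the `POLICY_RC_D` override exists purely so it can be tried outside a real chroot.

```shell
#!/usr/bin/env bash
# Run a command with a deny-all policy-rc.d in place, then remove the file.
POLICY_RC_D="${POLICY_RC_D:-/usr/sbin/policy-rc.d}"

with_policy_rc_d() {
  local rc
  # Same policy script as in the pattern above: refuse all service starts.
  cat > "$POLICY_RC_D" << 'EOF'
#!/bin/sh
echo "All runlevel operations denied by policy" >&2
exit 101
EOF
  chmod 755 "$POLICY_RC_D"
  # Run the wrapped command non-interactively and remember its exit status.
  DEBIAN_FRONTEND=noninteractive "$@"
  rc=$?
  rm -f "$POLICY_RC_D"
  return "$rc"
}
```

Usage would look like `with_policy_rc_d apt-get -y install <packages>`, with the wrapped command's exit status passed back to the caller.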

shane
2018-06-11 22:00
(replace the "dist-upgrade" with whatever you're actually trying to do)

dave.parker
2018-06-11 22:00
Ah yeah, I think you showed me this once before when I had a similar problem that I ended up solving by not installing the packages. :slightly_smiling_face: Let me see if I can get it to work

shane
2018-06-11 22:00
:slightly_smiling_face:

dave.parker
2018-06-11 22:01
I'm trying to avoid creating a new bootenv to put the packages in a pkgsel command. Which works, but. Eww.

shane
2018-06-11 22:01
I think @vlowther had some other APT magic that might need added within the chroot environment

shane
2018-06-11 22:02
hmm - I was working with a pattern to update the bootenvs to add User added packages via a Param/Profile applied to the Machines to inject in to the KS/Seed dynamically

vlowther
2018-06-11 22:02
Yeah, whenever I want to install extra packages I tend to do it as a post install task.

shane
2018-06-11 22:03
which can be done by just writing a simple stage/task/template and adding in to the Workflow

vlowther
2018-06-11 22:03
Sorta.

vlowther
2018-06-11 22:04
If you are getting errors doing installs when running the post install scripts, there are a few things that can go wrong.

vlowther
2018-06-11 22:05
Most of them are bypassed by making sure you have configured repos properly and by using the package install helpers.

greg
2018-06-11 22:05
tip has `extra-packages` parameter that is a list of strings that get added to the `pkg-sel` part of the netseed file.
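
A small sketch of preparing a value for a list-of-strings param like `extra-packages`. It is not from the thread: `packages_to_json` is a hypothetical helper, and the `drpcli machines set ... param ... to ...` call in the comment assumes that form is available in your drpcli version.

```shell
#!/bin/sh
# Hypothetical helper: turn package names into a JSON list of strings.
# Assumes the names need no JSON escaping, which holds for Debian package names.
packages_to_json() {
  local out="" pkg
  for pkg in "$@"; do
    # Append a comma separator only when out already has an entry.
    out="${out:+$out,}\"$pkg\""
  done
  printf '[%s]\n' "$out"
}

# Example (not run here; MACHINE_UUID is a placeholder):
#   drpcli machines set "$MACHINE_UUID" param extra-packages to "$(packages_to_json vim htop)"
```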

vlowther
2018-06-11 22:05
Or that.

vlowther
2018-06-11 22:06
My usual real answer is to install packages just before you need them in whatever workload, and not as part of the OS install process.

vlowther
2018-06-11 22:06
:slightly_smiling_face:

shane
2018-06-11 22:07
I thought there was something about APT and stdin/stderr needing to be attached for some operations ... or something ...

shane
2018-06-11 22:08
right `extra-packages` ... that's what I was thinking of :slightly_smiling_face:

vlowther
2018-06-11 22:08
Yeah, so dpkg pre and post scripts do really silly things with fds.

vlowther
2018-06-11 22:09
for "over 20 years of legacy" reasons.

zehicle
2018-06-12 14:50
new Inventory stage! We're preparing for the v3.9 release and I wanted to share a recent addition to the task-library that adds often requested inventory features in a simple way. 1) SUMMARY - allows users to define a flat list of items pulled from GOhai (using jq filters) 2) INTEGRITY - allows users to halt workflow if the latest inventory does not match previous inventory as a way to detect tampering or changes 3) VERIFICATION - allows users to apply basic regex to inventory values to validate systems comply with expectations. These settings are configured via params in a shared profile. The following video shows how to configure them.

zehicle
2018-06-12 14:50
The stage is in the task-library tip in advance of the release

zehicle
2018-06-12 14:51

dave.parker
2018-06-12 15:22
So there's an extra-packages parameter that adds user defined packages directly to the pkgsel line in the preseed available in tip?

dave.parker
2018-06-12 15:22
And also you removed the restriction on hostnames beginning with a numeral?

dave.parker
2018-06-12 15:22
I think those are two good reasons to update.

spector
2018-06-12 15:23
I have created a new Digital Rebar Community Welcome Guide for new and existing community members. The document has all the information in one place for getting started, understanding how the community works, etc. If you have any feedback for additional content please let me know. The file is available at http://u.rackn.io/dr_guide

zehicle
2018-06-12 17:50
The guide is very cool - good graphics and content about the community

dave.parker
2018-06-12 18:06
I just updated my test machine to the latest tip, and I see the "extra-packages" param but not where it actually gets used, which I assumed would be in net-seed.tmpl. Is it not fully implemented yet?

greg
2018-06-12 18:08
line 92 in net-seed.tmpl

greg
2018-06-12 18:08
Assuming tip drp-community-content.

dave.parker
2018-06-12 18:21
Ah. I didn't have tip drp-community-content installed

dave.parker
2018-06-12 18:21
I see it now.

dave.parker
2018-06-12 18:21
Thanks

2018-06-13 18:33
has joined #community201806

2018-06-13 18:59
are there existing BootEnvs for Atomic distributions?

spector
2018-06-13 19:01
@ $Welcome

2018-06-13 19:01
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

shane
2018-06-13 19:02
@ - in DRPv3 (current product) - no we don't have those

shane
2018-06-13 19:02
Not sure if we had anything like that for previous version

2018-06-14 14:53
on a scale of 1 to impossible, how hard would it be to write a custom dhcp strategy?

zehicle
2018-06-14 14:54
can you give some ideas about what you are thinking for the strategy?

greg
2018-06-14 14:58
@ - about halfway. :slightly_smiling_face:

2018-06-14 15:12
probably an API strategy or something like it

2018-06-14 15:14
what I want is to call out to the switch and assign the IP based on which port the MAC is on

2018-06-14 15:15
but thats not generic at all, so I figured it would probably be best to embed that logic in an API somewhere, and just call the API with a MAC

greg
2018-06-14 15:20
@ - have you looked at DHCP option 82?

greg
2018-06-14 15:28
If you have a switch capable of injecting Option 82 (one of subfields contains the network ingress information), then a strat could be written to use that info to hand out reserved addresses. It is almost like the strat system was designed for that. :wink: It needs more work and a place to test it.
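
For anyone inspecting what a switch injects, here is a minimal sketch of walking the sub-options inside an option 82 payload. Per RFC 3046 the payload is a sequence of code/length/value triples, with sub-option 1 the Agent Circuit ID and sub-option 2 the Agent Remote ID. The function name and the sample bytes are made up for illustration, and this is not the DRP strategy code, which per the thread has not been written.

```shell
#!/bin/sh
# Hypothetical helper: given the option 82 payload as a hex string,
# print "<sub-option code> <value as hex>" for each sub-option.
# No validation is done for truncated or odd-length input.
parse_option82() {
  local hex="$1" code len value
  while [ -n "$hex" ]; do
    code=$((0x$(printf '%s' "$hex" | cut -c1-2)))
    # Length byte counts bytes; double it to count hex digits.
    len=$((0x$(printf '%s' "$hex" | cut -c3-4) * 2))
    value=""
    if [ "$len" -gt 0 ]; then
      value=$(printf '%s' "$hex" | cut -c5-$((4 + len)))
    fi
    printf '%d %s\n' "$code" "$value"
    # Drop the sub-option just consumed and continue with the rest.
    hex=$(printf '%s' "$hex" | cut -c$((5 + len))-)
  done
}

# Example with made-up bytes: circuit ID deadbeef, remote ID 0a0b0c
#   parse_option82 0104deadbeef02030a0b0c
```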

zehicle
2018-06-14 15:30
:face_with_rolling_eyes: "almost like" (tm)

shane
2018-06-14 15:33
If you can do dynamic pool for sledgehammer boot, you can use a stage with the LLDP stuff to get port info and calculate final IP

vlowther
2018-06-14 15:35
Yeah, the whole reason we have a Strategy field in our DHCP related models is because a previous customer wanted to do something with option 82.

vlowther
2018-06-14 15:36
but "we want to do something with option 82" is as far as they got before other concerns became bigger, so we never did anything with that.

vlowther
2018-06-14 15:37
I have vague dreams of being able to attach some lua code to strategies to do their thing, but a vague dream is all it is right now. :confused:

zehicle
2018-06-14 15:38
<historian font>in some ways... option 82 was the genesis for DRP because the current v2 code base could not handle it or the repercussions</historian font>

2018-06-14 15:51
has joined #community201806

2018-06-14 15:57
It does look like our switches support option 82

2018-06-14 15:57
working on getting it enabled to see what it spits out, but that will probably work

greg
2018-06-14 15:58
well - there is some DRP work to do. :slightly_smiling_face:

greg
2018-06-14 15:58
actually, if you could send us a DHCP log trace at DEBUG level with something booting through it, that would be awesome.

spector
2018-06-14 15:59
@ $Welcome

2018-06-14 15:59
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

dave.parker
2018-06-14 19:54
I want to use a value set in the gohai-inventory in a bootenv template. What's the easiest way to do this? Specifically the value I want to get can be grabbed like this from the CLI:

dave.parker
2018-06-14 19:55
```
$ drpcli machines show ae76c224-cb98-4b29-a555-470dd26da418 | jq '.Params["gohai-inventory"]["DMI"]["System"]["SerialNumber"]'
"CQDRBM2"
```

shane
2018-06-14 19:57
@dave.parker - you might want to look at the Inventory pattern that @zehicle just did

shane
2018-06-14 19:58
Inventory pattern ... @dave.parker

shane
2018-06-14 19:58
inventory vid

shane
2018-06-14 19:59
that'll allow you to congeal down the huge JSON blob to your defined set of key/value stuff you want - also you can do a check on hardware -vs- what you expect to be there ...

shane
2018-06-14 20:08
@dave.parker - if you're using either the Gohai or the new Inventory pieces, it depends where/how you're trying to access the JSON structure, the `drpcli` approach will work, but you can also use Templating to render the info inside of a Template for direct use - this is a LOT lighter weight than using `drpcli` because it reduces the external API calls and is done as part of the template rendering piece

shane
2018-06-14 20:11
you can do something like: `{{.ParamAsJSON "gohai-inventory"}}` or `{{.ParamAsJSON "inventory/data"}}`
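
A rough sketch of how that could look in a bash template, pulling out the serial number from the earlier `jq` query. This is a template fragment rather than a standalone script: DRP renders the `{{...}}` expression before the script runs. It assumes `jq` is available where the script executes, and the variable names are placeholders.

```
#!/usr/bin/env bash
# The template engine replaces the expression below with the machine's
# gohai-inventory as JSON; the quoted heredoc keeps bash from expanding it.
inventory_json="$(cat <<'EOF'
{{.ParamAsJSON "gohai-inventory"}}
EOF
)"

# Same path as the drpcli query, minus the leading .Params lookup.
serial="$(printf '%s' "$inventory_json" | jq -r '.DMI.System.SerialNumber')"
echo "Serial number: ${serial}"
```

In a template that is not a script (a preseed, say), Go's built-in `index` function can walk nested maps, so `{{index (.Param "gohai-inventory") "DMI" "System" "SerialNumber"}}` is a plausible alternative, assuming `.Param` returns the decoded structure. That form is untested here.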

shane
2018-06-14 20:12
these can then be used however you want inside your Template (bash, python, ruby, whatever)

dave.parker
2018-06-14 20:30
Ok, thanks

zehicle
2018-06-14 20:31
You can even loop over the results of a list like we make for inventory/data using range!

shane
2018-06-14 20:40
example Range template pattern:
```
{{if .ParamExists "array-of-things"}}
{{range $key := .Param "array-of-things"}}
echo "item-in-array: '{{$key}}'"
{{end}}
{{else -}}
echo "Param 'array-of-things' not available."
{{end -}}
```

zehicle
2018-06-14 21:52
if you set good defaults in the params, you can even skip the .ParamExists and go right to range!

zehicle
2018-06-14 21:52
{{range}} also has an {{else}} to handle the empty case. It's really cool stuff!
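
A sketch of that shorter form, as a template fragment. It assumes the param `array-of-things` (a placeholder name) has a default defined, so `.Param` can be called without the `.ParamExists` guard. In Go templates the `{{else}}` branch of a `{{range}}` runs when the list has no elements.

```
{{range $item := .Param "array-of-things" -}}
echo "item-in-array: '{{$item}}'"
{{else -}}
echo "Param 'array-of-things' is empty."
{{end -}}
```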

zehicle
2018-06-14 21:53
gushes of simplicity of golang templates (and knows that it has limits too)

2018-06-15 15:04
has joined #community201806

2018-06-15 15:23
- My coworker just called RackN / left a voicemail - he's got a consulting gig with a short window and wants to see if RackN meets his requirements. Any chance someone can reach out to him today?

2018-06-15 15:24
His name is Chris Harms.

2018-06-15 15:26
Got a Slack from him this morning: "Are you aware of any products that would be good for bare metal hardware automation? Specifically looking at firmware, bios, and IOS upgrade on Dell, Netapp and Cisco hardware."

2018-06-15 15:26
Time to feed the :bear:!

shane
2018-06-15 15:31
@ $welcome :slightly_smiling_face:

2018-06-15 15:31
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

2018-06-15 15:32
Also, FWIW, @ - is not the same person. Just bad timing :slightly_smiling_face:

shane
2018-06-15 15:32
no problem

2018-06-15 15:57
has joined #community201806

greg
2018-06-15 16:01
@ $welcome - Lots of docs starting in the link below.

2018-06-15 16:01
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

2018-06-15 16:02
thanks @shane

spector
2018-06-15 16:03
@ it's just Chris day at Digital Rebar

2018-06-15 16:17
thanks all for the welcome. I need to speak to someone about getting pricing for the product. Who can I talk to?

spector
2018-06-15 16:19
@ Saw your email and responded. Thanks

2018-06-15 18:11
@zehicle - thanks for connecting with CHarms!

zehicle
2018-06-15 18:11
@ happy to help out. sounds like an interesting use case.

zehicle
2018-06-15 18:19
we can setup off community slack rooms if you want private discussion

2018-06-15 18:29
Yeah, maybe we could setup an OST room with Chris and I + RackN peeps? That way when this RFQ gets a response - we can hit the ground running :slightly_smiling_face:

2018-06-18 19:04
Hoping someone can lead me in the right direction. I can't figure out why a global change-stage/map param isn't working. I'm trying to apply the discover -> sledgehammer-wait stage map to my discovered machines but the stage never moves from discover.

2018-06-18 19:04
```
➜ ~ drpcli profiles show global | jq
{
  "Available": true,
  "Description": "Global profile attached automatically to all machines.",
  "Errors": [],
  "Meta": {
    "color": "blue",
    "icon": "world",
    "title": "Digital Rebar Provision"
  },
  "Name": "global",
  "Params": {
    "change-stage/map": {
      "centos-6-install": "complete-nowait",
      "discover": "sledgehammer-wait:success",
      "ubuntu-16.04-install": "complete-nowait"
    },
    "local-repo": true
  },
  "ReadOnly": false,
  "Validated": true
}
```

2018-06-18 19:04
```
➜ ~ drpcli machines list | jq -r '.[] | "\(.Name) :: \(.Profiles) :: \(.Stage) :: \(.BootEnv)"'
d94-57-a5-6b-37-18.ctilab.com. :: ["global"] :: discover :: sledgehammer
d3c-a8-2a-1d-52-50.ctilab.com. :: ["global"] :: discover :: sledgehammer
d00-50-56-bf-4d-71.ctilab.com. :: ["global"] :: local :: local
d3c-a8-2a-1d-56-d0.ctilab.com. :: ["global"] :: discover :: sledgehammer
```

2018-06-18 19:05
Is there a log I can look at to see why the stage isn't advancing to sledgehammer-wait?

shane
2018-06-18 19:09
@ typically this is due to an error in the `discover` stage - first step is to make sure you have the `sledgehammer` BootEnv installed and "available" - `drpcli bootenvs show sledgehammer | jq '.Available'` should return `true`

2018-06-18 19:09
Yep, I uploaded it
```
➜ ~ drpcli bootenvs show sledgehammer | jq '.Available'
true
```

shane
2018-06-18 19:09
second - HIGHLY recommend you ditch the `change-stage/map` method of driving machines through Workflow - and take a look at `workflows` - which is available (oddly enough) - via the `workflows` menu item in the Portal

shane
2018-06-18 19:09
good

shane
2018-06-18 19:10
next step is to make sure you have the Prefs set correctly - if you can, run `drpcli prefs list | egrep "default|unknown"`

2018-06-18 19:11
I can do that....I was just trying to follow along with some documentation that leveraged the change-stage/map method

shane
2018-06-18 19:11
hmm - which Doc link? I updated the Quickstart a while back in `tip`


2018-06-18 19:11
It was the DRPv3 Training

shane
2018-06-18 19:12
ah - yes sorry - the training decks aren't updated yet :disappointed:

2018-06-18 19:14
I'll try the workflow method to see if there's a change in behavior

2018-06-18 19:15
```
➜ ~ drpcli prefs list | egrep "default|unknown"
  "defaultBootEnv": "sledgehammer",
  "defaultStage": "discover",
  "unknownBootEnv": "discovery",
  "unknownTokenTimeout": "6000"
```

2018-06-18 19:16
prefs should be set correctly

shane
2018-06-18 19:16
Looks good

2018-06-18 19:18
Applied the discovery workflow and rebooting the bare metal servers now

shane
2018-06-18 19:18
check the physical (or remote) console for one of the machines

shane
2018-06-18 19:19
make sure it's booting the sledgehammer Stage 1 and Stage 2

shane
2018-06-18 19:19
if so - log in (root/rebar1) to the console and check the sledgehammer service log: `journalctl -u sledgehammer`

2018-06-18 19:20
I would note we are using hpe hardware if that matters

shane
2018-06-18 19:20
shouldn't matter

shane
2018-06-18 19:20
are you using DHCP from DRP, or external DHCP service ?

2018-06-18 19:20
DHCP from DRP

2018-06-18 19:21
Just booted into stage 1


2018-06-18 19:23
Doesn't like the hostname

shane
2018-06-18 19:24
does your domainname in the Subnet spec have a trailing dot (`.`) ?

shane
2018-06-18 19:24
that blows it up

shane
2018-06-18 19:25
ah - yes, looks like yours has a trailing dot - please remove that from the subnet spec then it should work
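
A tiny sketch for spotting and cleaning the trailing dot before it goes into a subnet spec. The helper name is hypothetical and the check is plain string handling, nothing DRP-specific.

```shell
#!/bin/sh
# Hypothetical helper: drop a single trailing dot from a domain name.
# Names without a trailing dot pass through unchanged.
strip_trailing_dot() {
  printf '%s\n' "${1%.}"
}

# Example:
#   strip_trailing_dot "ctilab.com."
```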

2018-06-18 19:25
will do

2018-06-18 19:25
learning pains

shane
2018-06-18 19:26
did you add the Subnet spec via the UX ?

2018-06-18 19:26
yes, just modified what was already there

shane
2018-06-18 19:27
yeah - that is a UX bug that we fixed - but I don't think the fix pushed from the `tip` UX portal to the `stable` portal :disappointed:

shane
2018-06-18 19:27
@zehicle pushed that fix a few weeks ago

shane
2018-06-18 19:27
if you switch to the `tip` portal (the beaker icon in upper right), you'll be using the updated Portal

shane
2018-06-18 19:28
but we haven't marked it "stable" yet - though it's been very stable and hasn't been changed in a few weeks

shane
2018-06-18 19:38
@ any luck ?

2018-06-18 19:42
Had to remove the machines and reboot. It was caching the domain name with the incorrect . at the end

2018-06-18 19:42
stage still shows discover though

shane
2018-06-18 19:43
you'll also need to make sure the `defaultWorkflow` pref is set (see the quickstart)
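
A minimal sketch for catching this kind of gap: it reads `drpcli prefs list` output and reports any named pref that is absent or empty. `check_prefs` is a hypothetical helper, and the text match assumes the `"key": "value"` line format shown earlier in the thread.

```shell
#!/bin/sh
# Hypothetical helper: read `drpcli prefs list` output on stdin and report
# each pref named in the arguments that is missing or set to an empty string.
# Returns 1 if anything is missing, 0 otherwise.
check_prefs() {
  local input key missing=0
  input="$(cat)"
  for key in "$@"; do
    if ! printf '%s\n' "$input" | grep -Eq "\"$key\": *\"[^\"]+\""; then
      echo "missing pref: $key"
      missing=1
    fi
  done
  return "$missing"
}

# Example (not run here):
#   drpcli prefs list | check_prefs defaultWorkflow unknownBootEnv
```

Run against the prefs output posted above, this would flag `defaultWorkflow`, which is the pref that turned out to be unset.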

2018-06-18 19:44
ah yes, that was missed

2018-06-18 19:44
setting now

2018-06-18 19:55
Looks good now, all 3 of my bare metal servers have now booted into sledgehammer-wait

2018-06-18 19:56
@shane++

shane
2018-06-18 19:56
Woot woot!

2018-06-18 20:33
Is there a zsh completion for drpcli?

shane
2018-06-18 20:33
nope - sorry

shane
2018-06-18 20:33
just BASH - no idea if any of it's compatible ...

2018-06-18 20:34
nope, it throws a bad option error

2018-06-18 20:34
```/usr/local/etc/bash_completion.d/drpcli:type:10343: bad option: -t```

shane
2018-06-18 20:34
we're open for pull requests to add it ... :slightly_smiling_face:

2018-06-18 20:35
sounds good, thanks


2018-06-19 14:58
I just transferred the krib content package to my DRP instance

2018-06-19 14:59
both the krib-config and krib-install throw the invalid template name error

shane
2018-06-19 15:03
What version of KRIB did you pull in?

2018-06-19 15:04
v1.8.0

shane
2018-06-19 15:06
please update to the `tip` version in the contents menu

2018-06-19 15:15
ok, is there a way to just update the contents menu or do I need to update drp to tip?

shane
2018-06-19 15:17
In the Contents menu item -you should see "KRIB" in the center panel - there is a (very small) drop down arrow to the right of the "KRIB" name, if you click it, it should give you a pick list of versions

shane
2018-06-19 15:17
select "tip" then "upgrade"

2018-06-19 15:17
yep, I see it, thanks

shane
2018-06-19 17:40
@ ... any joy with the upgraded KRIB ?

2018-06-19 17:47
has joined #community201806

2018-06-19 17:48
Yep, the updated content did the trick

2018-06-19 17:48
ty

shane
2018-06-19 17:48
xclnt

vlowther
2018-06-19 17:58
There are already 3 of us on the meetup zoom. :slightly_smiling_face:

shane
2018-06-19 17:58
- community meetup zoom link: https://zoom.us/j/3403934274

2018-06-19 18:02
Is this just a general meetup, providing updates?

shane
2018-06-19 18:03
updates, demos, community discussion ...

vlowther
2018-06-19 18:03
We hold them every 2 weeks -- they are updates, feedback, showing off the latest shiny things, etc.


romain.lafontaine
2018-06-19 19:29
Damn... I missed it... Waiting for the replay... :disappointed:

spector
2018-06-19 20:09
working on it right now

spector
2018-06-19 20:11
It will be at https://youtu.be/a_r8w2et3gU in about 5 minutes or so

2018-06-19 20:13
Does the `latest` docker container tag contain the v3.9 KRIB HA features?

shane
2018-06-19 20:16
Latest should be tracking what will become v3.9.0 - if you pull it, run `drpcli info get` - it should be a version tag like v3.8.2-tip-123

shane
2018-06-19 20:16
123 will be something else

2018-06-19 20:27
`v3.8.2-tip-189-2f5945fb1c39b08c76116da9971883e8a1da6e46`
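
A small sketch for telling a tip build from a tagged release by splitting a version string of the shape shown above. The function name and the `build`/`commit` labels are my own; the thread only says the number after `tip` varies, so treating the last field as a commit hash is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: split a DRP version string into its visible parts.
# Strings without "-tip-" are reported as plain releases.
parse_drp_version() {
  local v="$1" base rest build commit
  base="${v%%-tip-*}"
  if [ "$base" = "$v" ]; then
    echo "release $v"
    return 0
  fi
  rest="${v#*-tip-}"
  build="${rest%%-*}"
  commit=""
  # Only set commit when something follows the build number.
  case "$rest" in
    *-*) commit="${rest#*-}" ;;
  esac
  echo "tip base=$base build=$build commit=$commit"
}

# Example:
#   parse_drp_version v3.8.2-tip-189-2f5945fb1c39b08c76116da9971883e8a1da6e46
```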

2018-06-19 20:27
perfect, ty

2018-06-19 20:30
Is the KRIB-HA plugin available yet? or is that a later release?

zehicle
2018-06-19 20:41
@ we're still working on some aspects - I'll coordinate w/ @greg and see where we stand

2018-06-19 20:42
ok, no worries, ty

shane
2018-06-19 20:43
I'd guess another week for some of the rough edges to be sorted out, and then it'll be available in "MVP" form

shane
2018-06-19 20:43
but, a lot going on right now, so ... that may slip

zehicle
2018-06-20 00:12
we're in the process of moving the KRIB work into the DigitalRebar github repo (likely under the community-content path) and changing the license to APLv2.

zehicle
2018-06-20 00:13
I'll be starting that shortly. I'm evaluating w @greg how/when to port over the HA parts since they require the Cert Generator plugin (which is not open at the moment)

2018-06-20 09:47
has joined #community201806

2018-06-20 11:10
Hi guys! I just found your work, which is really cool, appreciate that ! I'm interested in DRP with Ansible and Dynamic Inventory built on top of DRP, I saw demos and showcases focused on Kubernetes and some notes in documentation for DRP V2. Do you have some hidden gems available for community? Thanks!

ctrees
2018-06-20 11:25
Welcome Shs

ctrees
2018-06-20 11:25
woops... attempting to get the robot to share some of the general info...


ctrees
2018-06-20 11:32
@shane is the community guru and oracle :wink:... but based on what you've just said... following the on-line demos of K8s you'll get a great dose of what DRP can do...

ctrees
2018-06-20 11:35
@spector just released the community meetup where @shane demo'd HA-Krib https://youtu.be/a_r8w2et3gU?t=1395

spector
2018-06-20 13:00
The $Welcome guide @

2018-06-20 13:00
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

shane
2018-06-20 13:32
@ - welcome indeed - we also have the Ansible content which provides basic/generic ansible inventory to any ansible playbook. It's used with our Ansible Kubespray kubernetes content as well.

shane
2018-06-20 13:32
@ctrees - thx :slightly_smiling_face: we choose the variable reference for our custom responses, like $faq


shane
2018-06-20 13:35
@ - you'll find some basic documentation on the ansible and kubespray pieces, in the Integrations section of our documentation: https://provision.readthedocs.io/en/tip/doc/integrations/ansible.html

zehicle
2018-06-20 14:25
There's a great KRIB v2 / HA demo in the back half of the community meeting! Thanks @shane KRIB UPDATE >>> @greg and I reviewed plans today around KRIB and have some exciting updates! We're going to be: 1) moving the content over to the digitalrebar/community-content repo over the next few days (it will still be in the RackN SaaS catalog of course) 2) combining HA KRIB code and KRIB (likely, we'll just make HA the official v2) 3) changing the license to APLv2 4) building some specialized UX to support KRIB use from Profiles 5) moving the code for VirtualBox IPMI and Packet IPMI into the open in a new digitalrebar/plugins repo (to be created) and updating those to be APLv2 also. I'm excited to move this work into a community license and I hope that people in the Digital Rebar community will help us refine and extend Kubernetes operations patterns for bare metal.

2018-06-21 03:37
has joined #community201806

2018-06-21 05:16
I've got some servers booting into the sledgehammer image, but they're not showing up as machines in the UI. Any ideas what might be going on? (I've got 60 machines and about 1/4 of them are booting but not showing up)

2018-06-21 05:17
Also, what's the default username and password for the sledgehammer image?

greg
2018-06-21 11:45
The access creds for sledgehammer are root rebar1.

greg
2018-06-21 11:46
With regard to the partial machines, what version of drp are you running?

greg
2018-06-21 11:46
From sledgehammer you can check journalctl -u sledgehammer

2018-06-21 12:15
Guys, thanks for hints above!

zehicle
2018-06-21 12:38
Hi @ $welcome

2018-06-21 12:38
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

shane
2018-06-21 15:53
@ if you are using DHCP on DRP, make sure your domain name does not have a trailing dot in it - that's a bug in subnet creation on UX side

shane
2018-06-21 15:55
You'll see an error for hostname in sledgehammer if you do `journalctl -u sledgehammer ` if that is the case

2018-06-21 17:32
@greg I'm running v3.8.1-tip-5-b47c1ec97c4712fbe12e8d4a4f2fa752a09a6f0b

greg
2018-06-21 17:34
How are your machines getting their IPs? I'm trying to find out if machines are getting names that start with numbers.

greg
2018-06-21 17:34
If so, you need to update to the latest tip DRP.

2018-06-21 17:42
Machines get their ips from dhcp provided by drp

greg
2018-06-21 17:44
hmm - okay. @ - what does `journalctl -u sledgehammer` show on the machines not getting registered?

2018-06-21 18:11
Found logs that were listing another dhcp server that the network guy forgot to shut off. That fixed the first node, so hopefully the rest will properly register now.

2018-06-21 18:20
I've been running into a bug where multiple ssh public keys get all concatenated onto one line. The fix appears to be to add an extra line in the template that generates them. Anyone else run into that?

2018-06-21 18:20
It makes it so that only the first key works.

shane
2018-06-21 18:21
What version of DRP community content are you running?

2018-06-21 20:37
has joined #community201806

2018-06-21 20:44
@ uploaded a file: https://rackn.slack.com/files/UBBSHF675/FBC06760J/terraformerror.png and commented: Hello, I have been trying to run the Terraform Provider for Digital Rebar Provision, and whenever I do "terraform apply" I get the following error. (Versions: DR v3.8.2 & UX latest)

2018-06-21 20:45
is this a bug in DRP?

greg
2018-06-21 20:51
what version of terraform are you using @?

greg
2018-06-21 20:52
Terraform arbitrarily may have changed their field name validation.

greg
2018-06-21 20:52
@ - can you send me (privately if desired) your plan file?

2018-06-21 20:53
Terraform v0.11.7

2018-06-21 20:53
simple plan

2018-06-21 20:53
provider "drp" {
  api_user     = "rocketskates"
  api_password = "r0cketsk8ts"
  api_url      = "https://127.0.0.1:8092"
}
resource "drp_machine" "onenode" {
  count       = 1
  Workflow    = "ubuntu18"
  Description = "updated description"
}

greg
2018-06-21 20:54
hmm - I'm using 0.11.3

greg
2018-06-21 20:54
though it is now yelling at me to update.

greg
2018-06-21 20:55
does `terraform plan` fail the same?

2018-06-21 20:55
yes, same error

greg
2018-06-21 20:55
cool

greg
2018-06-21 20:55
for me.

greg
2018-06-21 20:56
what version of drp?

2018-06-21 20:56
3.8.2

greg
2018-06-21 20:57
sorry - you told me that.

greg
2018-06-21 20:57
hmm - thinking...

greg
2018-06-21 20:57
my terraform doesn't hit this.

2018-06-21 20:58
do I need to do extra steps after the discovery?

2018-06-21 20:58
all my nodes are in sledgehammer-wait state

greg
2018-06-21 20:58
You do. You need to add the `terraform-ready` stage to the end of your discovery workflow.

greg
2018-06-21 20:59
This comes from the `terraform` content package

greg
2018-06-21 21:00
just upgraded to 0.11.7 and I don't get the error.

greg
2018-06-21 21:00
I wonder if your env has TF vars defined?

greg
2018-06-21 21:00
or some strict plugin setting.

2018-06-21 21:01
let me look at the discovery workflow and get back to you

greg
2018-06-21 21:05
okay - cool

2018-06-21 21:06
I am not sure how to add the terraform-ready stage?

greg
2018-06-21 21:07
Do you have a discovery workflow?

2018-06-21 21:07
yes

greg
2018-06-21 21:07
Do you have the terraform-ready stage?

2018-06-21 21:07
no

greg
2018-06-21 21:07
you will need to get that from the content packages.

greg
2018-06-21 21:08
You may have to go into the catalog and enable it.

greg
2018-06-21 21:08
Through the SaaS portal login.

2018-06-21 21:08
got it

greg
2018-06-21 21:08
Cool

greg
2018-06-21 21:09
then drag it onto the end of the discovery workflow

2018-06-21 21:10
I will restart the discovery workflow and try terraform apply now

2018-06-21 21:19
are the nodes supposed to be in a terraform-ready stage ?

2018-06-21 21:19
before I apply my terraform plan

2018-06-21 21:19
?

zehicle
2018-06-21 21:20
They need to have the terraform params set



2018-06-21 21:23
these are the current parameters, before applying the plan

greg
2018-06-21 21:23
yeah - looks right

greg
2018-06-21 21:24
`terraform/managed` set to `true` means that terraform can touch the machine.

greg
2018-06-21 21:24
`terraform/allocated` set to `false` means that terraform hasn't allocated the machine to a user yet.

greg
2018-06-21 21:24
It works like a big pool.
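
The pool logic described above, written out as a small sketch. The function name and the state labels are mine; only the meaning of the two params comes from the thread.

```shell
#!/bin/sh
# Hypothetical helper: classify a machine from its two terraform params.
#   $1 = value of terraform/managed
#   $2 = value of terraform/allocated
terraform_pool_state() {
  case "$1:$2" in
    true:false) echo "available" ;;  # terraform may touch it, not yet handed out
    true:true)  echo "allocated" ;;  # already allocated to a user
    *)          echo "unmanaged" ;;  # terraform should leave it alone
  esac
}

# Example:
#   terraform_pool_state true false
```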

2018-06-21 21:25
ok, I still get the same error

greg
2018-06-21 21:25
well - machines are closer to being ready.

greg
2018-06-21 21:26
The error has to do with terraform trying to validate the plugin, but I don't understand why your plugin and mine have different results.

greg
2018-06-21 21:26
Where did you get the plugin?


2018-06-21 21:27
I cloned

greg
2018-06-21 21:28
hmmm - okay

2018-06-21 21:28
then inside the directory I did "go get" then make build

2018-06-21 21:28
it didn't produce the binary so I ran scripts/build.sh

2018-06-21 21:29
then it built the binary



2018-06-21 21:29
ok I will try this one

greg
2018-06-21 21:30
Did you build from master?

greg
2018-06-21 21:30
there isn't much difference

greg
2018-06-21 21:30
between the two at the moment.

greg
2018-06-21 21:31
My guess is that the vendoring is messed up in the tree

greg
2018-06-21 21:32
Need to change the build process in the makefile to call the build tool.

2018-06-21 21:33
ok I am not getting the error anymore with the prebuilt plugin!

greg
2018-06-21 21:33
Yeah - the problem is probably vendor dependencies.

greg
2018-06-21 21:33
`go install` doesn't do that right.

2018-06-21 21:34
ok everything works fine now!

2018-06-21 21:34
thank you very much for your time

greg
2018-06-21 21:34
np - I'll fix the makefile shortly.

greg
2018-06-21 21:35
I'm in the middle of cutting releases and updating things so I'll slip this in as well.

2018-06-22 14:21
has joined #community201806

greg
2018-06-22 18:04
- Release 3.9.0 is cut and out. Content packages are moved to v1.9.0 and plugins are at v2.3.0. Terraform plugin is at v1.1.1.

greg
2018-06-22 18:05
Amazon is still running slowly - The UX portal page is not updated yet, but should be in the next 24 hours. You can use http://tip.rackn.io to get latest UX.

greg
2018-06-22 18:06

spector
2018-06-22 18:07
@ $Welcome

2018-06-22 18:07
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

zehicle
2018-06-22 18:09
This release has A LOT of great functionality (and also a lot of code movement) including HA, RBAC, Multi-Tenant, and encrypted Params. Of course, bug fixes and performance too. It's worth reviewing the release notes carefully.

dave.parker
2018-06-22 20:14
Hey folks.

dave.parker
2018-06-22 20:15
Suddenly having a problem getting the UX to show machine details. When I click on a machine in the UX the page just hangs at loading forever. The chrome developer tools give me this as the call that's failing: ```{Model: "machines", Key: "", Type: "GET", Messages: ["Filter not found: slim"], Code: 406} Code : 406 Key : "" Messages : ["Filter not found: slim"] Model : "machines" Type : "GET"```

dave.parker
2018-06-22 20:16
Any thoughts? This apparently started happening today. It was still working for me but not anybody else on the team until we rebooted the rebar server, now I get the same problem.

shane
2018-06-22 20:17
Something shady with slim ...

shane
2018-06-22 20:18
@dave.parker - the `slim` feature is newly introduced by @vlowther and @greg to reduce the parameter amount pushed back in an API call

shane
2018-06-22 20:18
are you using "http://portal.rackn.io" or the "http://tip.rackn.io" UX ?

dave.parker
2018-06-22 20:19
Looks like portal

shane
2018-06-22 20:19
give it a shot on Tip (click on the "beaker" icon in upper right)

shane
2018-06-22 20:19
we're moving over the Tip UX to Portal (stable), but the backend AWS systems are nightmarishly slow

dave.parker
2018-06-22 20:20
Looks like I get the same error.

shane
2018-06-22 20:22
what version DRP are you running ?

greg
2018-06-22 20:24
Okay - The UX is supposed to not use slim with older DRPs. It is a bug in the UX.

dave.parker
2018-06-22 20:24
Versions: DR v3.8.0 & UX v1.3.0

greg
2018-06-22 20:24
I suspect.

greg
2018-06-22 20:24
Yeah - update to DR v3.9.0 and it will go away.

dave.parker
2018-06-22 20:24
Ok. So upgrade?

dave.parker
2018-06-22 20:24
Ok, thanks.

greg
2018-06-22 20:28
Hmm - Strange. It works for me with something without the slim feature-flag.

greg
2018-06-22 20:28
but not as far back as 3.8.0

shane
2018-06-22 20:28
I got the error w/ v3.8.2-tip-189

shane
2018-06-22 20:29
(using `tip` UX)

greg
2018-06-22 20:31
hmm - v3.8.2-tip-184 with tip UX doesn't for me.

greg
2018-06-22 20:34
details machine page. May not be going far enough.

greg
2018-06-22 20:34
just a second.

greg
2018-06-22 20:34
Okay - see it now

greg
2018-06-22 20:47
Fixed. New release out there and moving through the system.

shane
2018-06-22 23:07
@greg and @dave.parker - just confirming that the fix has rolled out to stable UX - and my testing shows it working well with the same test case that failed for me previously

zehicle
2018-06-23 21:20
The slim API feature just slipped into the release at the last minute... it allows queries to omit params and meta data, which can make models large. If you don't care about that data on a query, omitting it will improve wire performance.

2018-06-24 18:17
Error: GET: bootenvs/sledgehammer: Not Found

2018-06-24 18:17
I can't load any of the boot environments, is the remote server up?

zehicle
2018-06-24 19:16
I'll check

zehicle
2018-06-24 19:16
but anything you're getting from the DRP API is local

zehicle
2018-06-24 19:16
what's the full URL?

shane
2018-06-24 19:18
@ the upload iso helper command works just fine for me - from the local Shell to my DRP instance:
```
root@demo:~# drpcli bootenvs uploadiso sledgehammer
{
  "Path": "sledgehammer-6122f34b46b5b74b668d6779e33f5fcd0f44a8cc.tar",
  "Size": 367605248
}
```

shane
2018-06-24 19:18
if you are running `drpcli` from a remote machine (eg your laptop) - you have to remember to set the Endpoint address and port to connect to on your remote DRP instance - for example: `drpcli --endpoint https://192.168.8.1:8092 info get`

shane
2018-06-24 19:19
you can also use the environment variables, like so: `export RS_ENDPOINT=https://192.168.8.1:8092`

shane
2018-06-24 19:19
then: `drpcli info get` will reference the _RS_ENDPOINT_ variable appropriately
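
A minimal sketch of the two options above, assuming a bash shell. The `drp_endpoint` helper name is made up for illustration and is not part of drpcli; the fallback `https://127.0.0.1:8092` is an assumption about what a local install listens on.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the endpoint drpcli should be pointed at.
# Uses RS_ENDPOINT when it is set, otherwise an assumed local default.
drp_endpoint() {
  printf '%s\n' "${RS_ENDPOINT:-https://127.0.0.1:8092}"
}

# Option 1: pass the endpoint explicitly on each call.
#   drpcli --endpoint "$(drp_endpoint)" info get
#
# Option 2: export it once and let drpcli pick it up from the environment.
#   export RS_ENDPOINT=https://192.168.8.1:8092
#   drpcli info get
```

Printing the resolved value first is a quick way to see which endpoint a remote `drpcli` call is actually going to hit.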

shane
2018-06-24 19:21
if you are unsure of where the Sledgehammer (for instance) BootEnv ISO comes from, you can check the BootEnv itself for the `IsoUrl` location:
```
root@demo:~# drpcli bootenvs show sledgehammer | jq '.OS.IsoUrl'
"http://rackn-sledgehammer.s3-website-us-west-2.amazonaws.com/sledgehammer/6122f34b46b5b74b668d6779e33f5fcd0f44a8cc/sledgehammer-6122f34b46b5b74b668d6779e33f5fcd0f44a8cc.tar"
```


shane
2018-06-24 19:22
this verifies the download location (OS.IsoUrl) is not the issue

2018-06-24 21:04
Thank you @shane I think it was the endpoint configuration

shane
2018-06-24 21:04
great - glad it's good now

2018-06-26 20:02
has joined #community201806

2018-06-26 22:22
I'm looking to buy the `os-other` license, how can I get started?

shane
2018-06-26 22:51
@ $welcome

2018-06-26 22:51
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

shane
2018-06-26 22:52
@ you should be receiving an email from us soon, if you haven't already

2018-06-26 22:56
Rob just DMed me - I think I'm good for now - thanks

2018-06-26 23:05
@shane how do i get at the os-other in the trial?

shane
2018-06-26 23:16
You should see it in the UX under the "Content Packages" menu entry - if not - check the "Browser For More Content". You also MUST select the Organization that was created for you (that provides your entitlement rights for licensing) in the upper Left (blue button)

shane
2018-06-26 23:17
@ ^^^

2018-06-26 23:17
Gotcha - thanks

shane
2018-06-26 23:18
let me know if you have any issues with that

2018-06-26 23:19
sure thing

2018-06-27 16:36
has joined #community201806

2018-06-27 17:03
so I have the same error, running from local machine this time

2018-06-27 17:04
and I didn't change the endpoint configuration

shane
2018-06-27 17:05
@ $welcome

2018-06-27 17:05
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

2018-06-27 17:05
I am having an experimental setup on VBox

shane
2018-06-27 17:06
trying to find our original conversation in community - I hate slack threading :disappointed:

2018-06-27 17:06
Error: GET: bootenvs/sledgehammer: Not Found

2018-06-27 17:06
this error when I try: "drpcli bootenvs uploadiso sledgehammer"

shane
2018-06-27 17:10
does `drpcli info get` return information on the DRP Endpoint ?

2018-06-27 17:12
"address": "127.0.0.1", "api_port": 8092, "arch": "amd64", "binl_enabled": true, "binl_port": 4011, "dhcp_enabled": true, "dhcp_port": 67,

shane
2018-06-27 17:12
cool - what about: `drpcli contents show drp-community-content | jq '.meta.Name'`

2018-06-27 17:13
Error: GET: contents/drp-community-content: No such content store

shane
2018-06-27 17:13
ah - yes - you have to install the Community Content pack to be able to use it :slightly_smiling_face:

shane
2018-06-27 17:14
by default, we install Community Content via the installer, unless you specify `--nocontent` during install time

shane
2018-06-27 17:14
the Community Content pack contains all of the pieces and parts to define how to do installations for basic KickStart/Preseeds
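
The checks from this exchange can be strung together as a small script. This is only a sketch, assuming bash and that `drpcli` exits non-zero when it prints an `Error:` line; the function name `check_bootenv` and its messages are made up for illustration. It only uses the `drpcli` subcommands already shown in this channel.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run the checks from this thread in order and stop
# at the first one that fails.
check_bootenv() {
  local bootenv="${1:-sledgehammer}"

  # 1. Can drpcli reach the endpoint at all?
  if ! drpcli info get >/dev/null 2>&1; then
    echo "endpoint unreachable: check --endpoint or RS_ENDPOINT"
    return 1
  fi

  # 2. Is the Community Content pack installed?
  if ! drpcli contents show drp-community-content >/dev/null 2>&1; then
    echo "drp-community-content missing: install the Community Content pack"
    return 2
  fi

  # 3. Is the BootEnv itself defined?
  if ! drpcli bootenvs show "$bootenv" >/dev/null 2>&1; then
    echo "bootenv $bootenv not found"
    return 3
  fi

  echo "ok: $bootenv is defined, 'drpcli bootenvs uploadiso $bootenv' should find it"
}
```

Running `check_bootenv sledgehammer` before `uploadiso` separates an endpoint problem from a missing content pack, which both surface as the same `Not Found` error.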

shane
2018-06-27 17:15
Are you using the Web Portal ?

2018-06-27 17:15
yes

2018-06-27 17:15
I just transferred the package

2018-06-27 17:15
I think it's working now

shane
2018-06-27 17:15
ta da !

2018-06-27 17:15
but I don't remember doing that before

2018-06-27 17:16
thanks, sorry if my questions are a bit silly

shane
2018-06-27 17:16
no worries - new learning curve

2018-06-27 17:17
thank you

2018-06-27 18:27
Hi @shane, thank you

2018-06-28 21:26
has joined #community201806

2018-06-28 22:29
Hi, is there a tutorial to create a new workflow that installs ESXi? I have done the following: 1-upload a boot iso. 2-created a template for kickstart.cfg to automate installation. 3-created a new boot environment. (not sure what templates to select here, I selected my kickstart template)

2018-06-28 22:30
4-created a stage to install ESXi. 5-created a workflow for the installation, with the previous stage then the complete stage. Whenever I execute the workflow, the machine reboots and fails to boot with my ESXi iso. Is there a generic template to select in any boot environment that would let the machine just boot from the iso?

spector
2018-06-28 22:30
@ $Welcome

2018-06-28 22:30
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

shane
2018-06-28 22:47
@ - have you pulled in the `os-other` content package in your DRP Endpoint? It contains ESXi installer content

2018-06-28 23:53
@shane would you happen to know why the blue login button is greyed out? I am trying to get familiar with setting up an esx boot environment, but running bulk actions and a variety of other actions requires being logged into rackN.. any suggestions are welcome

2018-06-28 23:56
hmm.. seems that button isn't a link to the rackN login, but each of the other tabs cause the main area of the page to show a "login" button, which is so I guess that answers the question

shane
2018-06-28 23:57
hmm - ok - were you referring to the upper right button in the corner ?

2018-06-28 23:57
yea.. and the buttons that show as locked..

2018-06-28 23:58
even though I'm logged in..

shane
2018-06-28 23:58
sometimes ... the UX is stubborn about deciding if you're logged in, and a page refresh makes it happier

2018-06-28 23:59
Overview, Bulk Actions, and Switches.. each of those have a lock next to them.. and even though the top left button shows the organization i am using.. the login is displayed in the main area of the page

shane
2018-06-28 23:59
did you log in to your DRP endpoint ?

shane
2018-06-28 23:59
there are 2 logins - one to the Portal for access/use of the portal and the Automation Library

shane
2018-06-29 00:00
the other to authenticate to your DRP Endpoint API services to be able to "control" it

shane
2018-06-29 00:00
the one in the center of the screen should be your DRP endpoint

2018-06-29 00:03
yea, just logged out, then logged in again....

2018-06-29 00:05
when hovering over the "login" button in the center of the screen.. the link that is displayed in the lower left of the browser is: https://portal.rackn.io/#/user/login. Seems odd because at the top left it shows the organization I selected when I logged in..

shane
2018-06-29 00:09
we'll look at the hover over help - but the center log in is the DRP auth credentials for authenticating to the Endpoint

shane
2018-06-29 00:10
sorry - I'm stepping out for a family function - will be back in a few hours

shane
2018-06-29 00:10
it's been a 13 hour day so far ...

2018-06-29 00:12
no worries, thank you for your help

2018-06-29 00:14
ok.. here we go.. the blue button at the top left of the page... I have 2 options there.. my account.. and the organization

2018-06-29 00:15
when I choose my account.. those locked buttons become unlocked... but when i choose the organization then they lock

2018-06-29 00:26
question about "contents": is there a way to delete custom contents with a bad character in the name? I can list all contents but when I try to delete by id (name with the bad character, in my case "\n") it doesn't find the content

2018-06-29 00:27
didn't work through UI either... but figured it out

2018-06-29 00:28
converted "\n" to its URL-encoded (percent-encoded) form, %0A

2018-06-29 00:29
```drpcli contents destroy 'custom%0A'```
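
For names with other awkward characters, the same trick can be scripted. A minimal sketch, assuming bash (it uses substring expansion and `$'...'` quoting); `urlencode` is a made-up helper name, not a drpcli feature, and whether drpcli needs the encoded form for every special character is not confirmed here beyond the `\n` case above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: percent-encode a string so it can be passed as an ID.
# Unreserved characters (letters, digits, . _ ~ -) pass through unchanged;
# every other byte becomes %XX.
urlencode() {
  local LC_ALL=C s="$1" out="" c i
  for (( i = 0; i < ${#s}; i++ )); do
    c="${s:i:1}"
    case "$c" in
      [a-zA-Z0-9._~-]) out+="$c" ;;
      # A leading quote makes printf use the character's numeric value.
      *) out+=$(printf '%%%02X' "'$c") ;;
    esac
  done
  printf '%s\n' "$out"
}

# Usage, matching the fix above:
#   drpcli contents destroy "$(urlencode $'custom\n')"
```

Here `$'custom\n'` is bash quoting for the name followed by a real newline, so the helper receives the same bad character that is stored in the content name.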

shane
2018-06-29 00:39
@ - nice solve :slightly_smiling_face:

shane
2018-06-29 00:40
@ - I don't have an answer for you right now - I took a look at the back-end Org stuff to make sure your entitlements are right, and they look good to me ... have you tried to do a full hard-refresh of your browser session to the Portal ? that may help

2018-06-29 16:09
yes I did, do I need to get the iso from drp community as well, or can I use mine?

2018-06-29 16:39
also whenever I try to clone or create a new template I keep getting this error: Templates[3]: No common template for centos-6.ks.tmpl

2018-06-29 16:39
I am not even trying to create a centos template

greg
2018-06-30 05:31
- NOTE - TIP digitalrebar content will require a sledgehammer update. New sledgehammer supports NFS if needed.