2018-08-01 10:15
Quick confirmation Q - if I need to reboot a machine (after e.g. a `yum update` which upgrades the kernel) but it *doesn't* require a boot env change, is sourcing `./helper` and using `exit_reboot` in a task the correct approach? Couldn't find anything else solid in doco or source..

greg
2018-08-01 12:37
Yes - that is the proper technique.

greg
2018-08-01 12:39
@ - sorry, forgot your name to notify. With tip DRP, you should be able to Hard reboot out from under the runner, but the job will show up as failed and you must make the task idempotent to keep from going through reboot loops.

greg
2018-08-01 12:40
If you need to rerun that task, you can use `exit_incomplete_reboot`. That will mark the job incomplete, reboot the box, and when it comes back up it will restart at that task.
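
One way to keep such a task idempotent, sketched in Python purely to show the logic (a real task would more likely be a shell template that sources `./helper`): only ask for a reboot when the kernel that will boot next differs from the running one. The function names and kernel version strings are hypothetical.

```python
def needs_reboot(running_kernel: str, default_kernel: str) -> bool:
    """Reboot only when the next-boot kernel differs from the running one."""
    return running_kernel != default_kernel


def run_task(running_kernel: str, default_kernel: str) -> str:
    """Return the action the task would take (illustrative only)."""
    if needs_reboot(running_kernel, default_kernel):
        # A real task would call exit_reboot or exit_incomplete_reboot here.
        return "reboot"
    return "continue"


# First pass: the update installed a newer kernel, so a reboot is requested.
print(run_task("3.10.0-862.el7", "3.10.0-862.9.1.el7"))
# Second pass, after the reboot: the kernels match, so there is no reboot loop.
print(run_task("3.10.0-862.9.1.el7", "3.10.0-862.9.1.el7"))
```

Because the check depends only on the machine's current state, re-running the task after the reboot falls through instead of rebooting again.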

2018-08-01 12:48
@greg thanks, makes sense :slightly_smiling_face:

greg
2018-08-01 13:07
yikes - something makes sense. Quick change the product. :slightly_smiling_face:

2018-08-01 13:53
well i've managed to implement our previous foreman-based pxeboot setup in about a day using drp which took aaaages with foreman (and a ton of swearing) and this had no swearing, so please don't change it now :smile:

vlowther
2018-08-01 14:14
hm... sounds like things are getting too easy. Time to switch everything to use our own custom configuration and task language.

vlowther
2018-08-01 14:14
Based on INTERCAL.

2018-08-01 14:36
:face_with_raised_eyebrow:

vlowther
2018-08-01 15:11
:wink:

zehicle
2018-08-01 20:58
sneak peek of a new screen coming in the UX that allows you to analyze workflow job times

2018-08-01 21:53
has joined #community201808

shane
2018-08-01 22:43
@ $welcome

2018-08-01 22:43
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

2018-08-02 01:55
Hi, I haven't tried anything yet, but I want to make sure that my idea will work at all: we plan to buy 10 rack servers for a new project. The planned split: 1x provisioning server (DRP), 3x Kubernetes masters, 6x Kubernetes workers. As far as I understand, any hardware is OK as long as ipmitool works. Are there any recommendations? I've seen a "dell-support" content pack; is there any benefit to getting Dell servers?

2018-08-02 01:55
Time to feed the :bear:!

vlowther
2018-08-02 01:58
Tested firmware/RAID/BIOS config and update management.

vlowther
2018-08-02 01:59
We keep a couple of PowerEdge T320s in the office for testing purposes.

2018-08-02 02:03
sounds good

shane
2018-08-02 04:03
@ - sounds good - for the provisioning server at that scale - you don't need to dedicate a physical machine - if you have a machine you can drop a VM on for Digital Rebar Provision - you should be fine .... or, just don't spend much more than $1500 to $2500 tops on a rack server for the DRP service - unless you have some other use case for the DRP system

shane
2018-08-02 04:04
As @vlowther says - we best support Dell servers if you want to manage the BMC (iDRAC) and RAID hardware - but we can manage power cycle of any machine that can speak IPMI protocol too ... so basic power management is covered by anything with a BMC that speaks IPMI

2018-08-02 04:04
Time to feed the :bear:!

2018-08-02 04:12
I require my own provisioning server (which I will re-use for everything I can't run in the Kubernetes cluster, which I hope won't be much), since this will be a small island solution packed in a micro datacenter with no dependencies. So we don't have anything else to run the DRP on :slightly_smiling_face: Scenario:
• one micro datacenter, with cooling & UPS all inside
• a bit of networking hardware so that we can accept client connections and communicate with the internet (thinking of using metallb for external connectivity to the Kubernetes cluster)
• 10 rack servers (with the mentioned split)
The only external hardware will be just some wireless access points.

2018-08-02 16:59
has joined #community201808

2018-08-02 17:27
Hi all, I'm trying to import deployed machines into the Ansible inventory grid, inspired by the kubespray guide, but it fails with the default message about "No Ansible Profiles Defined" on the Ansible page. I added a list of machines to ansible/groups-members with the following syntax: "ansible/groups-members": { "server": [ "dd2-51-2c-d0-c6-b4.home.lab" ], "server2": [ "d0e-e8-1d-ff-d3-a3.home.lab" ], "node1": [ "d96-97-ce-d8-38-85.home.lab" ] }, I guess I cannot save them due to a syntax error. What exactly is wrong? Adding a couple of pictures to better describe my testing env.
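
A minimal sketch, assuming the parameter value has to be plain JSON: the fragment as pasted (a bare key followed by a trailing comma) is not a complete JSON document on its own, so generating the value programmatically at least rules out hand-typed syntax slips. Whether that is what the Ansible page actually rejects is not confirmed in this thread; the helper name is made up.

```python
import json

# Group layout copied from the message above.
groups_members = {
    "server": ["dd2-51-2c-d0-c6-b4.home.lab"],
    "server2": ["d0e-e8-1d-ff-d3-a3.home.lab"],
    "node1": ["d96-97-ce-d8-38-85.home.lab"],
}
params = {"ansible/groups-members": groups_members}


def is_valid_json(text: str) -> bool:
    """Return True if text parses as a single JSON document."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


generated = json.dumps(params, indent=2)
pasted_fragment = '"ansible/groups-members": { "server": [ "dd2-51-2c-d0-c6-b4.home.lab" ] },'

print(generated)
print("generated parses:", is_valid_json(generated))
print("pasted fragment parses:", is_valid_json(pasted_fragment))
```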

greg
2018-08-02 17:42
You may have to use the CLI for the moment to set some of the vars. We'll need to see what @zehicle says about it. May be a little while.

shane
2018-08-02 17:43
also - you may try setting the name of your profile to NOT include spaces - not sure if all of the places escape the profile correctly

2018-08-02 17:58
has joined #community201808

zehicle
2018-08-02 20:20
@ all of the ansible inventory generation code is in the inventory.py script. You should be able to look at that to see how the params are used. The ansible UX for those parameters may be pretty sensitive about how they are formatted. So use the regular profile editor as a fall back.

zehicle
2018-08-02 20:21
the good news is that it's JUST parameters feeding the inventory

shane
2018-08-02 20:21
@ $welcome :slightly_smiling_face:

2018-08-02 20:21
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

2018-08-02 22:20
hi, I am trying to install DRP and got this error:

2018-08-02 22:21
I am trying to install in one of our locations in Germany, is that an issue?


greg
2018-08-02 22:22
That is the OLD OLD project. You need https://github.com/digitalrebar/provision

2018-08-02 22:23
oh, I thought the documentation online was updated

2018-08-02 22:23
ok thanks

shane
2018-08-02 22:23
we've tried to stamp out references to the OLD OLD version, but we haven't been 100% successful :disappointed:

2018-08-02 22:29
also is it ok to use CentOS 7?

shane
2018-08-02 22:30
that's pretty much our go-to OS we use here ... so ... yeah

2018-08-02 22:30
great!

greg
2018-08-03 01:59
- DRP Release v3.10.0 and Content Packages Release v1.10.0 and Plugin Release v2.4.0 are out! https://github.com/digitalrebar/provision/releases/tag/v3.10.0

zehicle
2018-08-03 01:59
:heart:

shane
2018-08-03 02:33
^^^^^ Some great stuff in v3.10.0 release - one of the super cool features, we've added Sprig based Golang Template functions to our Template Rendering engines. This means you now get access to some 100+ utility functions you can use when building and rendering templates. Check out the list: http://masterminds.github.io/sprig/

zehicle
2018-08-03 02:51
My favorite feature is the workflow analysis page

2018-08-03 11:14
Hmm. I appear to have managed to "break" dhcp (not really) by having multiple vlans on client and drp server. One of the interfaces on the client has a reservation and because the vlan interfaces (e.g. `bond0.1010`, `bond0.1011` etc) get the same MAC address, they blindly get given the IP from the reservation - all vlans end up with the same IP. I thought fixing that might be as easy as adding a new Strategy that uses ifname + mac but Strategies only get passed the DHCP packet and have no knowledge of the interface :white_frowning_face:

vlowther
2018-08-03 12:21
Hm, I will have to think about that one.

vlowther
2018-08-03 12:21
Can you describe your network layout in more detail?

2018-08-03 12:38
Yeah so 4 VLANs - `boot (10.50.1.0/24)`, `storage (10.50.2.0/24)`, `migrate (10.50.3.0/24)`, `virt (10.50.4.0/24)`. Each Client and the DRP server has a LAG `bond0` to ToR switch. Each Client has a `p2p2` interface which is untagged on the `boot` VLAN. DRP server has interfaces `bond0.boot (10.50.1.254)`, `bond0.storage (10.50.2.254)`, `bond0.migrate (10.50.3.254)`, `virt (10.50.4.254)` Client servers have interfaces `p2p2 (DHCP - boot VLAN untagged)`, `bond0.storage (DHCP - storage VLAN tagged)`... (and the same for virt and migrate)

2018-08-03 12:39
the clients are configured to PXE boot off p2p2, install C7, and have bond0 with those vlan interfaces configured (storage, virt, migrate)

2018-08-03 12:41
they reboot, and get a dynamic lease from each interface - e.g. first server has `bond0.storage: 10.50.2.10`, `bond0.migrate: 10.50.3.10`, `bond0.virt: 10.50.4.10`

2018-08-03 12:41
If I then go and create a reservation for the 10.50.2.10 lease for that server, and then reboot it, all `bond0.<vlan>` interfaces will get a 10.50.2.10 IP

2018-08-03 12:45
this is because the reservation token is just the MAC address - and linux VLAN interfaces use the MAC address of the parent interface

2018-08-03 12:45
which means the server has the same MAC on every VLAN interface

2018-08-03 12:49
if i leave them all dynamic with no reservations, everything works fine

greg
2018-08-03 13:08
hmmm - so - I understand what is going on. We'll have to decide if changing the behavior will break lots of people or not. I suspect not, but need to think about it.

2018-08-03 13:22
helpful if i submit a github issue with the above?

vlowther
2018-08-03 14:01
hm...

vlowther
2018-08-03 14:01
the trick will be in deciding what we should use for a unique ID.

vlowther
2018-08-03 14:02
The DUID is out for several reasons.

vlowther
2018-08-03 14:03
I would prefer that the MAC address stayed globally unique, but...

vlowther
2018-08-03 14:06
I prefer to not rely on server-side interface names, because they can change, and it will do Horrible Things to future DHCP failover strategies.

2018-08-03 14:12
local interface IP maybe?

2018-08-03 14:12
(on the server-side)

2018-08-03 14:17
cos that's what "subnets" are anyway right, they just associate a listen address with a DHCP range

vlowther
2018-08-03 14:18
Yeah, that is what I have been leaning towards.

2018-08-03 14:19
is there ever a situation where a lease that was given out on one interface (subnet?) needs to be given out on another?

2018-08-03 14:19
i dont know enough about DHCP to answer that :smile:

vlowther
2018-08-03 14:19
We pretty much never deal in terms of interfaces beyond the low-level packet processing loop.

vlowther
2018-08-03 14:20
because they can be added, removed, updated, etc. on the fly.

2018-08-03 14:20
yeah as its so liable to change

2018-08-03 14:20
every time i restart networking my vlan subinterfaces get new interface indexes, proving that point

vlowther
2018-08-03 14:33
hm.

greg
2018-08-03 14:37
Could we track by subnet (either subnet IP from interface or giaddr field) and MAC? The non-reservation bond path works, so we are getting the subnet parts right. It breaks our assumption that MAC is king.

vlowther
2018-08-03 15:02
Yeah, the issue @ is running into is that reservations as currently defined are a global thing -- they say "mac foo gets IP bar", not "mac foo in subnet bar gets IP baz"
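
The difference between the two reservation models can be sketched as follows. This is an illustration only, not the dr-provision implementation: the MAC address, the lookup functions, and the `.10` dynamic lease are made up, and the subnets are the ones from the layout described earlier in the thread.

```python
import ipaddress

SUBNETS = {
    "storage": ipaddress.ip_network("10.50.2.0/24"),
    "migrate": ipaddress.ip_network("10.50.3.0/24"),
    "virt": ipaddress.ip_network("10.50.4.0/24"),
}
MAC = "aa:bb:cc:dd:ee:01"  # hypothetical; shared by every bond0.<vlan> interface

# "mac foo gets IP bar": the key is the MAC alone.
GLOBAL_RESERVATIONS = {MAC: "10.50.2.10"}
# "mac foo in subnet bar gets IP baz": the key also carries the subnet.
SCOPED_RESERVATIONS = {("storage", MAC): "10.50.2.10"}


def dynamic_lease(subnet: str) -> str:
    """Stand-in for the dynamic pool: always hands out the .10 address."""
    return str(SUBNETS[subnet].network_address + 10)


def offer_global(subnet: str, mac: str) -> str:
    return GLOBAL_RESERVATIONS.get(mac) or dynamic_lease(subnet)


def offer_scoped(subnet: str, mac: str) -> str:
    return SCOPED_RESERVATIONS.get((subnet, mac)) or dynamic_lease(subnet)


for name in SUBNETS:
    print(name, "global:", offer_global(name, MAC), "scoped:", offer_scoped(name, MAC))
```

With the MAC-only key every VLAN interface is offered 10.50.2.10; with the subnet in the key only the storage interface gets the reserved address and the others fall back to their own subnet's pool.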

vlowther
2018-08-03 15:03
I can extend them to cover that case, the question is what is the best approach.

vlowther
2018-08-03 15:06
I just don't want to break the use case of being able to assign an IP to a machine even though it is not in any of our subnet definitions.

vlowther
2018-08-03 15:11
The short-term workaround is to not create reservations for this particular usecase

vlowther
2018-08-03 15:12
and try not to run out of IP addresses in the subnets in question. :slightly_smiling_face:

vlowther
2018-08-03 15:12
and don't purge the expired leases.

2018-08-03 15:14
i mean in practice what we'd usually do is allocate the IP by DHCP on first run and then set it static on the interface

2018-08-03 15:14
but i didn't do that due to complexity and wanting to prove out something else

2018-08-03 15:14
and ran into this instead :smile:

vlowther
2018-08-03 15:14
Yeah, and that is a reasonable practice.

vlowther
2018-08-03 15:15
and as long as you don't run out of IPs and don't purge the leases, that will continue to work with us as well.

vlowther
2018-08-03 15:17
ok, so reservations will grow a Scoped flag to deal with this use case.

vlowther
2018-08-03 15:19
I will need to spend a bit of time fleshing this stuff out.

greg
2018-08-03 15:48
I forgot. Sledgehammer was updated in this release. Please update that iso if you update to v1.4.0 community content.

2018-08-03 16:32
@ has left the channel

florent.wagener
2018-08-03 17:55
hey guys, is it possible to use wildcards when selecting members of an object type in a tenant?

greg
2018-08-03 17:58
yes, I believe *. Not regex

greg
2018-08-03 17:58
Interesting feature request.

florent.wagener
2018-08-03 18:04
My interest is to select some machines that have similar names

florent.wagener
2018-08-03 18:04
I tried with * but it doesn't seem to work

greg
2018-08-03 18:05
yeah - `'*' or exact match. Not "fred*"`

greg
2018-08-03 18:06
should work. You can add each machine name, I think. "fred,greg,florent,victor,..."
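
The matching rule described here (a lone `*` or an exact name, no globbing) can be modelled like this. It is a sketch of the behaviour as stated in the chat, not code taken from DRP, and the function name is hypothetical.

```python
def member_selected(selectors: list, name: str) -> bool:
    """True if the tenant member list selects the given object name.

    Only two forms are honoured: the single wildcard "*" (everything)
    or an exact name. A pattern such as "fred*" is treated as a literal.
    """
    return "*" in selectors or name in selectors


print(member_selected(["*"], "fred1"))                 # wildcard selects everything
print(member_selected(["fred*"], "fred1"))             # no prefix matching
print(member_selected(["fred", "greg"], "greg"))       # exact names work
```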

florent.wagener
2018-08-03 18:07
ok

florent.wagener
2018-08-03 18:07
I tried `'fred*'` that's why ^^

florent.wagener
2018-08-03 18:08
Otherwise I am currently testing RBAC.

florent.wagener
2018-08-03 18:09
I have trouble with the claims. What exactly are the keywords for the scopes and actions ?

florent.wagener
2018-08-03 18:09
no suggestion is shown in the menu when I edit a role, but it is shown when creating a new one.

greg
2018-08-03 18:10
`drpcli info get | jq .scopes`

greg
2018-08-03 18:10
vi

florent.wagener
2018-08-03 18:13
yep, I confirm, editing a role in the UX isn't working.

greg
2018-08-03 18:14
Bug what you tried, please.

florent.wagener
2018-08-03 18:45
1. Create a new role 2. Set up the claims of the new role 3. Save it. 4. Go back to editing the previously saved role 5. Trying to modify the claims, or anything else, does not work.

florent.wagener
2018-08-03 18:49
editing with drpcli works: `drpcli roles update prod-role '"Claims": [{"action": "list", "scope": "machines"}]'` showed no issue.

zehicle
2018-08-03 20:52
@florent.wagener thanks, that helps. I'll take a look

florent.wagener
2018-08-03 23:20
I think my UX is unstable since the upgrade

zehicle
2018-08-03 23:27
You may need to accept the certificate from the endpoint.

2018-08-06 17:00
has joined #community201808

florent.wagener
2018-08-06 17:08
I've discovered that this bug can be worked around by doing a hard refresh too.

shane
2018-08-06 17:08
@ $welcome

2018-08-06 17:08
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

shane
2018-08-06 17:10
@florent.wagener - you should be able to force accept the new TLS certificate, by going directly to the DRP Endpoint API port - your web browser will then let you accept the self-signed certificate and redirect you to the stable portal. (example: https://10.10.10.10:8092 )

vlowther
2018-08-06 17:21
@ https://github.com/digitalrebar/provision/pull/963 has the reservation scopes change that should address the issues you are having.

vlowther
2018-08-06 17:21
Still a partial WIP, but it should give you an idea of where I am going with the concept.

florent.wagener
2018-08-06 19:13
hey guys, quick question, if I want to limit the usage of the IPMI plugin to specific actions, I guess I should use the `specific` field for that claim, which can only be specified through the command line right ?

greg
2018-08-06 19:15
yes - that is right.

florent.wagener
2018-08-06 19:32
so if I want to limit the IPMI plugins to only poweroff for example I should use a command like this: ``` drpcli roles update prod-role '"Claims": [{"action": "get, list", "scope": "plugins", "specific": "poweroff"}]' ``` Unfortunately, the specific isn't showing in the UX, maybe I am not using the right keyword ?

florent.wagener
2018-08-06 19:33
but it's showing in command line: ```drpcli roles show prod-role { "Available": true, "Claims": [ { "action": "get, list", "scope": "plugins", "specific": "poweroff" } ], ```

greg
2018-08-06 19:33
sorry - it would be action

greg
2018-08-06 19:34
wait!

greg
2018-08-06 19:34
let me think.

greg
2018-08-06 19:34
ipmi actions are scoped like:

greg
2018-08-06 19:34
scope = machines

greg
2018-08-06 19:35
actions = ipmi power commands - "poweroff,poweron"

greg
2018-08-06 19:35
specific = "UUID"

florent.wagener
2018-08-06 19:35
mmm

florent.wagener
2018-08-06 19:36
let me try

florent.wagener
2018-08-06 19:38
the UUID is the UUID of a machine right ?

greg
2018-08-06 19:40
yes or *

florent.wagener
2018-08-06 19:41
ok

florent.wagener
2018-08-06 19:45
so something like this should work: ``` { "actions": "poweroff, poweron", "scope": "machines", "specific": "*" }, ```

greg
2018-08-06 19:46
I think so

florent.wagener
2018-08-06 19:50
This isn't accepted: ``` { "action": "poweron, poweroff", "scope": "machines", "specific": "*" }, ``` ```Error: ValidationError: roles/prod-role No such action 'poweroff' No such action 'poweron' ``` This is accepted but doesn't do what I want: ``` { "actions": "poweron, poweroff", "scope": "machines", "specific": "*" }, ``` In fact it's transformed into something completely different when I do `drpcli roles show prod-role`: ``` { "action": "get,list", "scope": "machines", "specific": "*" }, ```

greg
2018-08-06 19:55
action is defaulted. Let me look.

greg
2018-08-06 19:56
action is the correct key.

florent.wagener
2018-08-06 19:58
in UX when I want to edit a claim, under the action tab, I have the possibility to also select `action` and/or `actions`, isn't that redundant?

florent.wagener
2018-08-06 20:00
this is weird :slightly_smiling_face:

greg
2018-08-06 20:01
that does seem weird. hmmm

greg
2018-08-06 20:02
I think this is a bug. It looks like we aren't pulling the plugin-provided actions into claims validation.

florent.wagener
2018-08-06 20:02
ok :slightly_smiling_face:

florent.wagener
2018-08-06 20:03
so my guess is for now I can't limit the plugins actions in a user role :slightly_smiling_face:

greg
2018-08-06 20:04
yeah - we intended to. Need to figure how to wire it. It is actually easy, but have to break some validation.

greg
2018-08-06 20:04
maybe - let me look a little more

florent.wagener
2018-08-06 20:04
of course :slightly_smiling_face:

greg
2018-08-06 20:05
actually, I think there is a way. just a second.

greg
2018-08-06 20:07
Try this: `"action": "action:poweron, action:poweroff"`

greg
2018-08-06 20:07
actions and action are valid as well because there are cli options to list the actions or action info on a machine.

greg
2018-08-06 20:08
So you can actually restrict the access to the list actions request on a machine.

greg
2018-08-06 20:08
or the show action <action> on a machine.

greg
2018-08-06 20:08
@florent.wagener - I think there is a way. the `action:<cmd>` form should work
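
A small sketch of building such a claim, using the `action:<cmd>` form together with the `scope` and `specific` values discussed above. The helper name is hypothetical, and wrapping the result in a `Claims` list simply mirrors the `drpcli roles update` payload shown earlier in the thread.

```python
import json


def plugin_action_claim(commands, scope="machines", specific="*"):
    """Build one claim that allows only the named plugin actions."""
    action = ", ".join(f"action:{cmd}" for cmd in commands)
    return {"action": action, "scope": scope, "specific": specific}


claim = plugin_action_claim(["poweron", "poweroff"])
role_patch = {"Claims": [claim]}

# The printed JSON is what would be handed to the role update.
print(json.dumps(role_patch))
```

Note that a role's claims are replaced as a list here, so any existing claims that should survive would need to be included alongside the new one.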

florent.wagener
2018-08-06 20:15
let me try

florent.wagener
2018-08-06 20:15
yes it's working !

greg
2018-08-06 20:17
okay - just forgot the syntax. I knew Victor had implemented. sorry for taking so long to pull it out of my mind.

florent.wagener
2018-08-06 20:18
No worries @greg

2018-08-06 20:54
@vlowther just saw your message from earlier, looks good :slightly_smiling_face: I'll build and have a play tomorrow if I get a chance :+1:

vlowther
2018-08-06 20:55
ok, coolio.

greg
2018-08-06 21:22
It should be in tip by morning US time

2018-08-07 11:17
Hi All! Are paravirtualized (virtio) disk drives provided by a KVM host supported? With Ubuntu 18.04 the installation fails on a missing disk; without the virtio controller the deployment finishes without error...

vlowther
2018-08-07 13:15
Yes, but you have to change the boot disk to `vda` using the `operating-system-disk` parameter.

vlowther
2018-08-07 13:15
since disks hanging off a virtio disk controller are not scsi devices.
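
As a rough sketch of the naming rule behind this: Linux exposes virtio-blk disks as `/dev/vdX` and SCSI/SATA disks as `/dev/sdX`, so the boot disk parameter has to follow the controller type. The `operating-system-disk` parameter name comes from the message above; the function and the bus labels are illustrative assumptions.

```python
def boot_disk_for(bus: str) -> str:
    """Pick the first-disk device name for a given disk bus.

    "virtio" here means virtio-blk; disks behind a virtio-scsi
    controller still show up as sdX.
    """
    return "vda" if bus == "virtio" else "sda"


# Parameter to set on a KVM guest that uses a virtio disk controller.
param = {"operating-system-disk": boot_disk_for("virtio")}
print(param)
```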

zehicle
2018-08-07 15:46
BUG ALERT: we've heard of an issue with the UX not connecting to endpoints recently. We have duplicated this issue and are working on a fix. If you would like updates about this issue, please subscribe to https://github.com/digitalrebar/provision/issues/965

zehicle
2018-08-07 16:13
I've posted a workaround for the bug in the bug history

romain.lafontaine
2018-08-07 18:06
The workaround doesn't work on my side :disappointed: waiting for the fix

zehicle
2018-08-07 19:04
We are testing a fix now.

zehicle
2018-08-07 19:04
you may be able to workaround by starting with URL/system

greg
2018-08-07 19:15
Okay - releases are published. tip,test,stable, and v1.4.1 are updated in the tree.

greg
2018-08-07 19:16
It will take up to 24 hours to filter through Amazon.

greg
2018-08-07 19:16
-^

2018-08-07 22:23
has joined #community201808

2018-08-08 19:27
Hi, I have a few questions regarding IPMI configuration for a large number of servers: if I have 50 servers that have been discovered and have obtained an IP address through DHCP, what is an efficient way to configure the IPMI address, username and password for them (assuming they are IPMI preconfigured)?

2018-08-08 19:28
this might be helpful, they all share the same user and passwd

2018-08-08 19:35
actually, this won't work, I will not be able to match obtained DHCP IPs with OOB IPs

vlowther
2018-08-08 19:35
Make a profile, configure it with the following params:

florent.wagener
2018-08-08 19:35
DHCP on the OOB VLAN might be an idea ?

2018-08-08 19:36
I am not allowed to modify current IPMI configurations

shane
2018-08-08 19:36
@ - if all user/pass same - you can set those as params in a profile, apply those to the machines - or globally in `global` profile

vlowther
2018-08-08 19:37
`{"ipmi/username":"your-username","ipmi/password":"your-password","ipmi/configure/user":false,"ipmi/configure/network":false}`

shane
2018-08-08 19:37
the IP addr for the BMC you might want to write small stage/task/template that you add to workflow, which uses "ipmitool" to grab the BMC IP address, and add it as a Param directly on the Machine

shane
2018-08-08 19:37
you can check the http://github.com/digitalrebar/provision-content - "KRIB" content templates for some patterns on how to write back Params on the Machine object
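
The "grab the BMC IP with ipmitool" step could look roughly like this. The sample text imitates the key/value layout of `ipmitool lan print` output and its values are made up; writing the result back as a Param on the Machine is deliberately left out, since the thread points to the KRIB templates for that pattern.

```python
from typing import Optional

# Illustrative sample in the style of `ipmitool lan print`; values are invented.
SAMPLE_LAN_PRINT = """\
Set in Progress         : Set Complete
IP Address Source       : DHCP Address
IP Address              : 10.20.30.40
Subnet Mask             : 255.255.255.0
MAC Address             : aa:bb:cc:dd:ee:ff
"""


def bmc_ip(lan_print_output: str) -> Optional[str]:
    """Return the value of the 'IP Address' field, or None if it is absent."""
    for line in lan_print_output.splitlines():
        # Split on the first colon only, so MAC addresses stay intact.
        key, _, value = line.partition(":")
        if key.strip() == "IP Address":
            return value.strip()
    return None


print(bmc_ip(SAMPLE_LAN_PRINT))
```

Matching the key exactly avoids picking up the neighbouring `IP Address Source` line.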

vlowther
2018-08-08 19:38
with that, the `ipmi-configure` task will not configure anything, but it should record the current IP address of the BMC.
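
Put together as a profile, the four params might be serialized like this. The profile name is hypothetical, and the `Name`/`Params` layout is an assumption about the profile object rather than something stated in this thread; the param keys and values are the ones quoted above.

```python
import json

profile = {
    "Name": "ipmi-shared-credentials",  # hypothetical profile name
    "Params": {
        "ipmi/username": "your-username",
        "ipmi/password": "your-password",
        # Leave the existing BMC user and network settings untouched.
        "ipmi/configure/user": False,
        "ipmi/configure/network": False,
    },
}

profile_json = json.dumps(profile, indent=2)
print(profile_json)
```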

2018-08-08 19:46
@vlowther so if I add the profile with : {"ipmi/username":"your-username","ipmi/password":"your-password","ipmi/configure/user":false,"ipmi/configure/network":false} to the machines, I would be able to control them without having to set the BMC IP addresses for every machine ?

vlowther
2018-08-08 19:47
After running them through a workflow with the `ipmi-configure` task, yes.

2018-08-08 19:47
ok, I will run few tests and see how it goes.

zehicle
2018-08-09 00:50
With 3.10, we improved the job timing calculation so that we can build some interesting analysis graphs. We have one that shows workflow run times. We've got another coming that compares task runs:

2018-08-09 07:48
has joined #community201808

romain.lafontaine
2018-08-09 12:51
Works like a charm, thx !

zehicle
2018-08-09 14:08
@ $welcome

2018-08-09 14:08
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

cameron.esdaile
2018-08-09 18:01
guys any notes / experience of running provision on docker. I see there is a regular push to docker hub official repo but wondering if there are any gotchas in terms of servicing dhcp traffic etc.

shane
2018-08-09 18:03
@cameron.esdaile - we do push to docker hub like you noticed, we have a number of people using it - and our Dockerfile has some details in it that may help with understanding the EXPOSE, etc around DHCP: https://github.com/digitalrebar/provision/blob/master/Dockerfile

cameron.esdaile
2018-08-09 18:09
@shane thanks for the pointer to the Dockerfile. I am more interested in the practical docker run command and I am assuming a --net=host flag would be needed to harvest the dhcp request off the host wire.

shane
2018-08-09 18:54
sure - we've seen usage in the field as follows: `docker run -itd --name drp --net host digitalrebar/provision:stable`

shane
2018-08-09 19:17
@cameron.esdaile sorry ... forgot to tag you ^^^^

cameron.esdaile
2018-08-09 20:54
@shane thanks for confirming the --net host

zehicle
2018-08-09 22:52
KUBERNETES update - the latest tip for KRIB automatically installs Helm and can install charts for you.

zehicle
2018-08-09 22:52

zehicle
2018-08-09 23:05
NOTICE ABOUT UX LICENSE ENFORCEMENT COMING.... We have added code into the RackN UX for the next release (it's already in http://test.rackn.io) that by default limits free accounts to 100 machines and DRP v3.8 or above. For RackN licensed or trial customers, these values are already set to reflect your current license terms. The UX will continue to support larger machine counts and older versions for licensed customers. We are looking for feedback and concerns about this change which will go fully in effect with the v3.11 release.

2018-08-10 09:27
how does that work, is that 100 machines per DRP endpoint? or if I had 4 endpoints with 25 machines each would that be at the limit?

zehicle
2018-08-10 11:41
Per endpoint

2018-08-10 16:16
is there any impact to the DRP backend API or is the scope purely for the UX?

zehicle
2018-08-10 16:17
@ no. this is UX only

2018-08-10 22:43
has joined #community201808

greg
2018-08-11 23:11
- Released the community content to pick up debian release updates.

shane
2018-08-11 23:22
@ $welcome

2018-08-11 23:22
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

shane
2018-08-14 14:33
Meetup today at 11am PST ... we hope to see all of you there! Topics:
• UX / Portal enhancements (Workflow Timings, Template Rendering, various easter eggs)
• AUDITING: demo of enhancements on what triggered events and who caused them to trigger
• OS: Mac OS BSDP support for NetBoot/NetInstall
• CLASSIFY: demo of Classification engine and auto workflow
Meetup info: https://www.meetup.com/digitalrebar/events/253623766

2018-08-14 16:31
Hi, I am running the ESXi install workflow through terraform, and everything is going well. However, after the installation successfully completes, terraform never stops and does not report that the installation is done. What stage do I need to add to the ESXi installation workflow to let terraform report that it's done?

shane
2018-08-14 16:41
@ what does your workflow currently look like? (`drpcli workflows show WORKFLOW | jq '.Stages'`)

2018-08-14 16:45
"esxi-6.5-install", "complete"

greg
2018-08-14 17:05
You need to alter your esxi kickstart to set the stage to complete

greg
2018-08-14 17:06
We should do that in the tree. Really.

2018-08-14 17:24
is there an example on how to set the stage in kickstart file?

shane
2018-08-14 17:29
ESXi kickstart %pre necessary to mark Machine complete in workflow.

shane
2018-08-14 17:29
@ ^^^^

shane
2018-08-14 17:29
we need to add that to the DRP based kickstart ...

2018-08-14 17:39
another thing, I am trying to run a workflow for large number of machines at once, the process seems to fail with large number ( > 5) of machines during PXE boot, it shows error reading files. any suggestions on how to approach and investigate this issue. I tried using bulk actions tab, as well as terraform provider.

zehicle
2018-08-14 18:01
Turn up logging level.

shane
2018-08-14 18:27
@ - fyi - we have tested 100s of concurrent machine provisioning activities successfully with DRP - the first thing to do is verify that you account for any external infrastructure that might be causing issues - http mirrors, DHCP services, DNS services, etc...

shane
2018-08-14 18:28
then take a look at the Workflow a given machine was running, and you can see task timings at the bottom of the workflow - this may help understand where the failures are

2018-08-15 07:23
the docker for digitalrebar/provision fails to build..i've created a PR


zehicle
2018-08-15 12:10
@ thanks for the pull! It's been merged please let us know if that resolved the docker builds

2018-08-15 14:08
@zehicle sure. working now on a rpi version of the Dockerfile

2018-08-15 14:27
by the way...is there any sample on how to setup a k8s cluster with rebar?

shane
2018-08-15 14:27
yes - our "KRIB" content does it via the kubeadm process

shane
2018-08-15 14:28
or if you're a masochist, you can do it with our Ansible/Kubespray integration as well

2018-08-15 14:28
kubespray is the way i was thinking of

2018-08-15 14:28
can you point me to the sourcecode?


2018-08-15 14:28
awesome! thank you

shane
2018-08-15 14:28
see the "KRIB" and the "Kubespray" content packages

shane
2018-08-15 14:29
KRIB is under very active development and gaining lots of capabilities and features currently

2018-08-15 14:29
excellent

2018-08-15 14:29
nice

shane
2018-08-15 14:29
there are a number of KRIB youtube videos as well - some of them are referenced in the documentation

2018-08-15 14:29
anything for setting up ceph ?

shane
2018-08-15 14:30
nope - we don't have a Ceph content pack ... but it could be pretty easily built ... if you had an existing Ansible playbook, our Ansible content should be able to handle it with minimal fuss

2018-08-15 14:30
perfect

2018-08-15 14:30
thank you

2018-08-15 14:31
does krib use ansible?

shane
2018-08-15 14:31
nope

shane
2018-08-15 14:32
it uses `kubeadm` patterns to build masters and join workers to the cluster - with dynamic master election (if desired)

2018-08-15 14:32
i'm looking more into multi-master cluster

shane
2018-08-15 14:32
it's a LOT faster than Ansible - around 5 to 8 mins versus 25 to 35 mins

2018-08-15 14:33
thats good :slightly_smiling_face:

shane
2018-08-15 14:34
we're working on some patterns with live boot-to-KRIB-image images, and kernel kexec patterns to radically accelerate this too - but those are probably a few months off

2018-08-15 14:34
ok

2018-08-15 14:35
i have this in ansible so it will probably be easier to use the kubespray content package

greg
2018-08-15 14:35
@ - there is a helm chart for ceph floating around that runs ceph in the cluster and adds it to the core.

2018-08-15 14:36
i'll try to find it

2018-08-15 14:36
thank you

2018-08-15 17:55
@shane when I run `docker run digitalrebar/provision`....should i be able to access the portal via localhost:8092?

zehicle
2018-08-15 17:56
did you map the ports?

2018-08-15 17:56
its in the Dockerfile so i assume they are mapped

2018-08-15 17:56
looks like they are mapped

2018-08-15 18:05
raspberry pi

greg
2018-08-15 18:10
:slightly_smiling_face:

2018-08-15 18:12
the libc was the issue for docker on arm

zehicle
2018-08-15 18:14
@ considering the footprint on RPi, running the golang directly w/o Docker would save overhead

zehicle
2018-08-15 18:14
but I'm glad to see you got it working

2018-08-16 15:10
can sledgehammer boot an RPi?

shane
2018-08-16 15:11
@ - at the moment, we do not support booting ARM based hardware - we are working on this feature - no specifics on release of it though - DRP Endpoint itself does support running on an ARM platform - and we have community and customers that have used RPi platforms successfully

2018-08-16 15:16
thanks @shane, can I at least serve a netboot image through drp so the Rpi can boot? or its all based on iPXE ?

shane
2018-08-16 16:18
@ - not entirely sure ... I'd have to check w/ @vlowther if he has any thoughts on that - we did just add (experimental) Apple BSDP support for Apple NetBoot and NetInstall capabilities - that's in the current `tip` version (and _very_ experimental right now)

2018-08-16 16:20
:thumbsup: thanks, i'm looking into terraform to power on/off and reset the pi cluster...I can do it with commands

shane
2018-08-16 16:37
looks like slack finally came back from a siesta ...

shane
2018-08-16 16:37

vlowther
2018-08-16 16:39
@ If you can point me at something that describes the netboot protocol an rpi uses, I can tell you how hard it would be to get dr-provision to netboot it.

2018-08-16 16:39
awesome


2018-08-16 16:41
it uses a bootcode.bin file that is served via the tftp

vlowther
2018-08-16 16:44
ok, so slap that and the associated start.elf (as mentioned in https://www.raspberrypi.org/documentation/hardware/raspberrypi/bootmodes/net.md) under /var/lib/dr-provision/tftpboot and see what happens.

vlowther
2018-08-16 16:45
That will not help with getting Sledgehammer to run on it for a huge variety of reasons, but it is a valid smoketest.

vlowther
2018-08-16 16:46
hm

vlowther
2018-08-16 16:48
reading more at that doc, the Pi signals that it is an x86 box running in legacy BIOS mode, with a magic extra vendor option.

vlowther
2018-08-16 16:49
sighs

vlowther
2018-08-16 16:50
There is a standard way in the PXE spec of signalling that you are an arm/arm64 box, guys.

vlowther
2018-08-16 16:51
instead, based on the incoming DHCP request I have no way of knowing if the box is an RPi or not.

vlowther
2018-08-16 16:51
I just have to supply the magic vendor option in the offer and hope.

vlowther
2018-08-16 16:53
They could have sent a vendor option in the DHCP request that indicates the box is an RPi, but nooo.

vlowther
2018-08-16 16:54
gah, whoever hacked that into the rpi firmware is an even worse firmware coder than usual.

vlowther
2018-08-16 16:59
Anyways, you will also need to throw an option 43 into whatever subnet you want to boot your rpis from that has the magic value `Raspberry Pi Boot` except maybe with 3 extra spaces at the end depending on how old your firmware and/or rpi is.

vlowther
2018-08-16 17:02
With that, a `/bootcode.bin` in the static file space (`/var/lib/dr-provision/tftpboot` by default), and the `start.elf` wherever it belongs, your RPi should then try to boot off whatever NFS share you configured for it to netboot from.
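
The two things to prepare here can be sanity-checked with a small script. This is only a sketch: it produces the two candidate option 43 strings (plain, and with three trailing spaces) and reports which boot files are missing from the TFTP root. It assumes the first-stage file is `bootcode.bin`, as named in the Raspberry Pi documentation linked above, and it says nothing about how the DHCP option itself is encoded. Both function names are hypothetical.

```python
from pathlib import Path

BASE_VALUE = "Raspberry Pi Boot"


def option43_candidates():
    """Return the two candidate strings: plain, and padded with three spaces."""
    return [BASE_VALUE, BASE_VALUE + " " * 3]


def missing_boot_files(tftp_root, names=("bootcode.bin",)):
    """Return the entries of `names` that are not regular files under tftp_root."""
    root = Path(tftp_root)
    return [name for name in names if not (root / name).is_file()]


if __name__ == "__main__":
    for value in option43_candidates():
        # repr() makes the trailing spaces visible.
        print(repr(value), len(value))
    print(missing_boot_files("/var/lib/dr-provision/tftpboot"))
```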

vlowther
2018-08-16 17:04
From there, getting it to something vaguely resembling Sledgehammer is an epic journey all on its own.

vlowther
2018-08-16 17:04
Mostly because we don't have one. :slightly_smiling_face:

vlowther
2018-08-16 17:05
/rant off (for now)

2018-08-16 17:08
thanks

2018-08-16 17:08
i'll give it a shot

2018-08-16 17:09
sledgehammer isn't too important...i'm thinking to boot rancheros for rpi

2018-08-16 21:40
hi all, it's really strange: i can display the machine profile via UUID, but i can't destroy it. any thoughts?

2018-08-16 21:40
$ drpcli machines show 838e512a-70f0-4483-b4a5-fcbed5401912
{
  "Address": "10.11.106.118",
  "Available": true,
  "BootEnv": "wmps-install",
  "CurrentJob": "",
  "CurrentTask": -1,
  "Description": "",
  "Errors": [],
  "HardwareAddrs": [],
  "Meta": {
    "feature-flags": "change-stage-v2"

2018-08-16 21:40
$ drpcli machines destroy 838e512a-70f0-4483-b4a5-fcbed5401912 Error: DELETE: key 838e512a-70f0-4483-b4a5-fcbed5401912: not found

vlowther
2018-08-16 21:43
hm... Any errors in the dr-provision log?

2018-08-16 21:49
i can't find the error logs; just see job-logs directory and a ton of files in there

shane
2018-08-16 21:51
@ - logs would be wherever you send STDOUT when starting up the DRP service in isolated mode, or in the Journal logs in systemd based systems if you start with systemctl

2018-08-16 21:52
got it; thanks

2018-08-16 22:02
i don't see any information in there related to destroying the machine at all

2018-08-16 22:03
Aug 16 22:02:22 labs-provision dr-provision[2207]: dr-provision2018/08/16 22:02:22.725844 [64175:2577]static [error]: /home/travis/gopath/src/github.com/digitalrebar/provision/midlayer/tftp.go:82
Aug 16 22:02:22 labs-provision dr-provision[2207]: [64175:2577]TFTP: lpxelinux.0: transfer error: sending block 0: code=0, error: TFTP Aborted

2018-08-16 22:03
that

shane
2018-08-16 22:03
that's normal

shane
2018-08-16 22:03
that's how TFTP likes to get a file size for transfer since TFTP itself (as a protocol) doesn't support file size request operation
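
For context, a sketch of what such a size probe can look like on the wire. It assumes the client uses the `tsize` option from RFC 2349; the log above does not show which option the client actually sent. The client asks for the file with `tsize` set to 0, a supporting server answers with the real size, and the client is then free to abort, which would match the "TFTP Aborted" lines. `build_tsize_probe` is a hypothetical helper name.

```python
import struct

OPCODE_RRQ = 1  # read request, RFC 1350


def build_tsize_probe(filename):
    """Build a TFTP read request that asks for the transfer size (RFC 2349).

    The request is the 2-byte opcode followed by NUL-terminated strings:
    filename, transfer mode, then option name/value pairs.
    """
    fields = [filename, "octet", "tsize", "0"]
    body = b"".join(field.encode("ascii") + b"\x00" for field in fields)
    return struct.pack("!H", OPCODE_RRQ) + body


if __name__ == "__main__":
    print(build_tsize_probe("lpxelinux.0"))
```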


2018-08-16 22:04
journalctl -u dr-provision.service -f and then $ drpcli machines destroy 838e512a-70f0-4483-b4a5-fcbed5401912 Error: DELETE: key 838e512a-70f0-4483-b4a5-fcbed5401912: not found

2018-08-16 22:04
i see nothing in the logs

shane
2018-08-16 22:05
hmm - try to bump the "default log level" up to `debug` in Info & Preferences - then try again - if nothing - try `trace`

shane
2018-08-16 22:06
don't leave it at `trace` long - that will significantly impact the performance of your DRP endpoint

shane
2018-08-16 22:06
can you also please do an: `ls -l /var/lib/dr-provision/digitalrebar/machines/838e512a-70f0-4483-b4a5-fcbed5401912.json`

shane
2018-08-16 22:07
that should be the write layer's stored copy of the JSON object for the Machine

shane
2018-08-16 22:07
(you may have relocated /var/lib/dr-provision from default ... )

2018-08-16 22:08
the file 838* does not exist in that directory

shane
2018-08-16 22:09
assuming you have more than this one machine on the DRP endpoint - can you see if there are JSON files in that directory ?

2018-08-16 22:10
yes, we have 169 files total

2018-08-16 22:11
$ ls -l /var/lib/dr-provision/digitalrebar/machines/8
81533173-5ac5-4826-9178-d88cfd22d7c7.json
8739d042-678b-48ff-9694-a69fe03da668.json
8ce1c027-25ce-429e-9c90-69c70b2b35dd.json
8f72549b-8344-4a64-83de-06c1a5d9404e.json
84101a29-daf2-4e04-bb88-fd7b816c2500.json
8acfeab3-d3c7-4c92-823c-4bcb4d7790a1.json
8e95bcfd-c915-464b-ad69-832935350234.json
84bc7826-1d6c-46c5-8716-5f5b2b28f3f8.json
8b9e9ea4-a146-460f-891f-7a53023e7d2c.json
8ecc7936-5283-47a9-8ae9-dcdb788c8f5d.json

2018-08-16 22:11
$ ls -l /var/lib/dr-provision/digitalrebar/machines/83
ls: cannot access '/var/lib/dr-provision/digitalrebar/machines/83': No such file or directory

shane
2018-08-16 22:11
hmm - so the write file store copy of that machine has been deleted it sounds like

shane
2018-08-16 22:12
when was last time you restarted that DRP endpoint - and what version is it running ?

shane
2018-08-16 22:12
`drpcli info get | jq .version`

2018-08-16 22:12
last restarted on jul-25

2018-08-16 22:13
"v3.8.0-0-1e4c58d48257bdb9562126fefc89897bae23210e"

shane
2018-08-16 22:13
did bumping logging level up give any better info ?

2018-08-16 22:15
if you don't mind showing me how to bump the logging level

shane
2018-08-16 22:15
are you using drpcli or UI ?

2018-08-16 22:16
drpcli

shane
2018-08-16 22:22
`drpcli prefs set '{"logLevel":"debug"}'`

shane
2018-08-16 22:23
weird ... that one doesn't have a "helper" ... so normally we can do something like: `drpcli prefs set debugFrontend warn`

shane
2018-08-16 22:23
but the `logLevel` field doesn't have a helper like that - so we have to use the JSON format to make it happy
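
When scripting this, hand-typed JSON is easy to get wrong (curly quotes pasted from chat are a common culprit). A minimal sketch that builds the argument list for the JSON form of `drpcli prefs set` shown above; `prefs_set_argv` is a hypothetical helper, and actually running the command is left to the caller:

```python
import json


def prefs_set_argv(prefs):
    """Build the argument list for `drpcli prefs set` using the JSON form."""
    # json.dumps always emits straight double quotes, so the payload is valid
    # JSON no matter how the preference names and values were typed in.
    return ["drpcli", "prefs", "set", json.dumps(prefs, separators=(",", ":"))]


if __name__ == "__main__":
    print(prefs_set_argv({"logLevel": "debug"}))
```

Passing the list to `subprocess.run(argv, check=True)` skips the shell entirely, so no extra quoting is needed.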

2018-08-16 22:26
alright the log level is set to debug mode

2018-08-16 22:28
i'm running
$ drpcli machines destroy 838e512a-70f0-4483-b4a5-fcbed5401912
Error: DELETE: key 838e512a-70f0-4483-b4a5-fcbed5401912: not found

2018-08-16 22:28
# journalctl -u dr-provision.service -f
-- Logs begin at Wed 2018-07-25 15:10:54 UTC. --
Aug 16 22:20:10 labs-provision dr-provision[2207]: dr-provision2018/08/16 22:20:10.934197 [64306:2592]frontend [audit]: /home/travis/gopath/src/github.com/digitalrebar/provision/frontend/frontend.go:359
Aug 16 22:20:10 labs-provision dr-provision[2207]: [64306:2592]Authenticated user ben from 10.10.135.119
Aug 16 22:21:32 labs-provision dr-provision[2207]: dr-provision2018/08/16 22:21:32.121559 [64435:2593]static [error]: /home/travis/gopath/src/github.com/digitalrebar/provision/midlayer/tftp.go:82
Aug 16 22:21:32 labs-provision dr-provision[2207]: [64435:2593]TFTP: lpxelinux.0: transfer error: sending block 0: code=0, error: TFTP Aborted
Aug 16 22:23:16 labs-provision dr-provision[2207]: dr-provision2018/08/16 22:23:16.754754 [64448:2594]static [error]: /home/travis/gopath/src/github.com/digitalrebar/provision/midlayer/tftp.go:82
Aug 16 22:23:16 labs-provision dr-provision[2207]: [64448:2594]TFTP: lpxelinux.0: transfer error: sending block 0: code=0, error: TFTP Aborted
Aug 16 22:25:01 labs-provision dr-provision[2207]: dr-provision2018/08/16 22:25:01.020282 [64456:2595]static [error]: /home/travis/gopath/src/github.com/digitalrebar/provision/midlayer/tftp.go:82
Aug 16 22:25:01 labs-provision dr-provision[2207]: [64456:2595]TFTP: lpxelinux.0: transfer error: sending block 0: code=0, error: TFTP Aborted
Aug 16 22:26:45 labs-provision dr-provision[2207]: dr-provision2018/08/16 22:26:45.149912 [64469:2596]static [error]: /home/travis/gopath/src/github.com/digitalrebar/provision/midlayer/tftp.go:82
Aug 16 22:26:45 labs-provision dr-provision[2207]: [64469:2596]TFTP: lpxelinux.0: transfer error: sending block 0: code=0, error: TFTP Aborted

shane
2018-08-16 22:29
nope - nothing there

2018-08-16 22:30
yeah

shane
2018-08-16 22:31
kick to `trace` and hit it fast to get logs, then kick it back down to `warn`

2018-08-16 22:32
ok

2018-08-16 22:33
nothing showed up in the logs after running
$ drpcli machines destroy 838e512a-70f0-4483-b4a5-fcbed5401912
Error: DELETE: key 838e512a-70f0-4483-b4a5-fcbed5401912: not found

2018-08-16 22:33
$ drpcli prefs set '{"logLevel":"trace"}'
{
  "baseTokenSecret": "OxCwy6xWaMMR3i-CKCk54GiFwjVWa5IC",
  "debugBootEnv": "warn",
  "debugDhcp": "warn",
  "debugFrontend": "info",
  "debugPlugins": "warn",
  "debugRenderer": "warn",
  "defaultBootEnv": "local",
  "defaultStage": "none",
  "knownTokenTimeout": "3600",
  "logLevel": "trace",

shane
2018-08-16 22:34
ok - lets make sure loglevel is back to 'warn' - then try Frontend - so:
```
drpcli prefs set '{"logLevel":"warn"}'
drpcli prefs set debugFrontend trace
```
hit it with the Delete, then set your Frontend back to "warn":
```
drpcli prefs set debugFrontend warn
```

2018-08-16 22:35
i did that

2018-08-16 22:38
when i run drpcli prefs set debugFrontend trace and $ drpcli machines destroy 838e512a-70f0-4483-b4a5-fcbed5401912

2018-08-16 22:39
the command is just hanging there without throwing the "Error..." message

shane
2018-08-16 22:40
hmm - hit it w/ drpcli again and drop debugFrontend back down to warn

2018-08-16 22:44
now $ drpcli prefs set debugFrontend warn is hanging too

shane
2018-08-16 22:44
trace on Frontend may have been too aggressive, and we may have wedged DRP :disappointed:

shane
2018-08-16 22:45
there are a few fixes around that between v3.8.0 and current v3.10.0 versions

2018-08-16 22:45
i can't run any commands rn.

2018-08-16 22:46
all hanging

shane
2018-08-16 22:46
unfortunately - you'll need to jump on the DRP endpoint and kill the process and restart ...

shane
2018-08-16 22:47
on the plus side, it's entirely possible that'll clear up the machine delete issue - we haven't seen that happen in the field before though - so was really hoping to get a log trace

2018-08-16 22:48
it seems like `drpcli prefs set debugFrontend trace` caused the dr-provision service to hang

shane
2018-08-16 22:48
yes - trace level is pretty aggressive

2018-08-16 22:50
restarted dr-provision but it's not working

shane
2018-08-16 22:52
ok - kill process - then check the file / edit to change to "warn": `/var/lib/dr-provision/digitalrebar/preferences/debugFrontend.json`

shane
2018-08-16 22:53
should look like: `{"Meta":{},"Name":"debugFrontend","Val":"warn"}`

shane
2018-08-16 22:53
after you edit it - then start DRP
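
The manual edit described above can also be scripted. A minimal sketch, assuming the file has exactly the shape shown (`Meta`, `Name`, `Val`) and that dr-provision is stopped while it runs; `reset_pref_file` is a hypothetical helper name, not part of DRP:

```python
import json
from pathlib import Path


def reset_pref_file(path, value="warn"):
    """Set the "Val" field of a stored preference file and write it back.

    Only run this while dr-provision is stopped, so the service does not
    overwrite the file or keep a stale in-memory copy.
    """
    pref_path = Path(path)
    pref = json.loads(pref_path.read_text())
    pref["Val"] = value
    # Compact separators reproduce the single-line layout shown in the thread.
    pref_path.write_text(json.dumps(pref, separators=(",", ":")))
    return pref
```

For the case in this thread the call would be `reset_pref_file("/var/lib/dr-provision/digitalrebar/preferences/debugFrontend.json")`, followed by starting DRP again.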

2018-08-16 22:54
alright it's back to normal

shane
2018-08-16 22:55
check if that Machine object exists now (`drpcli machines show <uuid>`)

2018-08-16 22:55
it no longer sees that UUID

2018-08-16 22:56
ls

shane
2018-08-16 22:56
so - the Machine object existed in the in-memory layers of the Backing Store ... but had been deleted from Disk at some point ... either by external action, or somehow by a drpcli command - without deleting the in-memory copy of the object

shane
2018-08-16 22:57
we've never seen that with DRP before ... so really curious how it got in to that state ...

2018-08-16 22:57
agree

2018-08-16 22:59
thanks a lot for all your help!

shane
2018-08-16 22:59
a simple test of this via external action shows how we can get in to this state:
```
root@demo:~/drp-data/digitalrebar/machines# drpcli machines show 53a4fb08-4184-4158-8e76-24843fb8cc58 | head -3
{
  "Address": "1.2.3.4",
  "Available": true,
root@demo:~/drp-data/digitalrebar/machines# rm 53a4fb08-4184-4158-8e76-24843fb8cc58.json
root@demo:~/drp-data/digitalrebar/machines# drpcli machines show 53a4fb08-4184-4158-8e76-24843fb8cc58 | head -3
{
  "Address": "1.2.3.4",
  "Available": true,
root@demo:~/drp-data/digitalrebar/machines# ls -l 53a4fb08-4184-4158-8e76-24843fb8cc58.json
ls: cannot access 53a4fb08-4184-4158-8e76-24843fb8cc58.json: No such file or directory
root@demo:~/drp-data/digitalrebar/machines# drpcli machines destroy 53a4fb08-4184-4158-8e76-24843fb8cc58
Error: DELETE: remove /root/drp-data/digitalrebar/machines/53a4fb08-4184-4158-8e76-24843fb8cc58.json: no such file or directory
```

shane
2018-08-16 23:00
so in that case, the on disk version was destroyed, but the in-memory version still exists

shane
2018-08-16 23:00
not sure if that's the case in your instance - but one path to get there
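
To spot this state without waiting for a failed delete, the UUIDs the endpoint reports can be compared with the files on disk. A minimal sketch, assuming each machine is stored as `<uuid>.json` in the machines directory as shown above; how the list of known UUIDs is obtained (for example from `drpcli` output) is left to the caller, and `machines_missing_on_disk` is a hypothetical helper name:

```python
from pathlib import Path


def machines_missing_on_disk(known_uuids, machines_dir):
    """Return the UUIDs the endpoint still reports that have no <uuid>.json on disk."""
    # The file stem (name without ".json") is the machine UUID.
    on_disk = {path.stem for path in Path(machines_dir).glob("*.json")}
    return sorted(uuid for uuid in known_uuids if uuid not in on_disk)
```

A non-empty result means the in-memory and on-disk views have diverged for those machines.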

2018-08-16 23:00
i see, in my case i was using drpcli machines destroy

2018-08-16 23:02
first i changed the bootenv but it didn't work, so i decided to destroy the machine and create it again; somehow, i ran into that error

shane
2018-08-16 23:02
ok - please let us know if you see it again - we'd like to understand what might have happened ...

2018-08-16 23:02
ok

zehicle
2018-08-17 00:45
@ I'd strongly recommend upgrading for community support. If you have to stay with 3.8, you may want to consider commercial support.

2018-08-17 15:13
hmm.. is the bsdtar binary name hardcoded in drp? Just realised that on SL6 bsdtar is provided as "bsdtar3" and the binary is named that too :confused:

greg
2018-08-17 15:43
Probably. :confused:

2018-08-17 15:45
just symlinked around it :smile:

2018-08-17 15:48
just manual install-ing with ansible now

2018-08-17 15:48
the install script makes some fair assumptions which don't always hold true

2018-08-17 15:49
(like... it prefers upstart over sysv but rhel6 has both, and for some reason upstart doesn't work with the given config :D)

2018-08-17 15:49
the only reason this is a faff is because I'm trying to install DRP on our production deployment systems *without* blowing them away and redeploying on c7 (yet)

greg
2018-08-17 15:51
A PR with notes would be helpful, if you can.

2018-08-17 15:56
it'll have to wait until monday but sure :slightly_smiling_face:

2018-08-17 16:36
@shane this happened again; i can't destroy the machine with UUID

2018-08-17 16:36
$ drpcli machines destroy 7aca0add-b2e5-4ea0-8a82-fe5e65e149a3 Error: CLIENT_ERROR Cannot handle content-type text/plain; charset=utf-8 No decoder for content-type text/plain; charset=utf-8

2018-08-17 16:36
but i can view the profile

2018-08-17 16:36
$ drpcli machines show 7aca0add-b2e5-4ea0-8a82-fe5e65e149a3
{
  "Address": "10.11.106.118",
  "Available": true,
  "BootEnv": "wmps-install",
  "CurrentJob": "",
  "CurrentTask": -1,
  "Description": "",
  "Errors": [],
  "HardwareAddrs": [],
  "Meta": {
    "feature-flags": "change-stage-v2"

shane
2018-08-17 17:23
@ - do you know what steps you may have performed at the Console/Shell, in `drpcli`, or UI ... that might have led to this ? We're looking at the backing store code and our handling around this, but haven't found any internal patterns yet that would lead to this condition where the backing JSON object on disk is deleted while the in-memory object model is still "laying around" ...

2018-08-17 18:08
ran a few more tests and the results are the same; here are the steps to reproduce it.

2018-08-17 18:08
Restart the dr-provision service (stop/start)
Create a new machine
Make sure the machine UUID exists in /var/lib/dr-provision/digitalrebar/machines/
Check the machine's profile; everything looks good.
Delete the machine and get an error message: Error: CLIENT_ERROR Cannot handle content-type text/plain; charset=utf-8 No decoder for content-type text/plain; charset=utf-8
Check the machine profile; it still looks good.
Check for the machine's UUID in /var/lib/dr-provision/digitalrebar/machines/; it's no longer there.
Restart the dr-provision service; the machine's profile no longer exists.

2018-08-17 18:11
$ drpcli version Version: v3.8.0-0-1e4c58d48257bdb9562126fefc89897bae23210e

vlowther
2018-08-17 18:27
hm.

vlowther
2018-08-17 18:28
If you upgrade to 3.10, does the same thing happen?

zehicle
2018-08-18 19:39
@johnsutten - did you manage to reset your login?

zehicle
2018-08-18 20:37
Just did a complete KRIB run through - https://youtu.be/rzBq3BsYQTM

johnsutten
2018-08-18 21:24
@zehicle I made it back

johnsutten
2018-08-18 21:27
as always you and the team have been busy

johnsutten
2018-08-18 22:04
i think this is right... https://imgur.com/a/mT8PQGu

zehicle
2018-08-18 23:38
are you using latest? what are you trying to do?

johnsutten
2018-08-19 00:37
working on a cluster...

johnsutten
2018-08-19 00:38
i was able to get a PXE boot and setup a workflow to install ubuntu 16.04 and it hangs on preseed at 16%

johnsutten
2018-08-19 00:38
I am using the steps to install for a production install

johnsutten
2018-08-19 00:41
the other question is how do i remove a step from a workflow i created

johnsutten
2018-08-19 00:52
fixed the workflow - it was faster to recreate than to edit and apply

zehicle
2018-08-19 02:27
workflow in the UX is drag and drop

zehicle
2018-08-19 02:28
I think ubuntu requires a DNS and/or external access. you may need to set your DHCP to find an external path


johnsutten
2018-08-19 03:59
i am sure im missing something in the /etc/network/interfaces..

johnsutten
2018-08-19 03:59
# The primary network interface
allow-hotplugs enp0s8
iface enp0s8 inet dhcp
# Secondary
allow-hotplugs enp0s9
iface enp0s9 inet static
address 10.0.0.1
netmask 255.255.255.0

johnsutten
2018-08-20 00:25
im going to install debian 9.5 and get back with you

johnsutten
2018-08-20 01:21
I think this config might work a bit better. Anyone have suggestions? enp0s8 is on the internet and 10.0.0.x is the dhcp pxe server network, and when machines are provisioned they can communicate with the outside world

johnsutten
2018-08-20 01:21
# The primary network interface
allow-hotplug enp0s8
iface enp0s8 inet static
address 192.168.1.12
netmask 255.255.255.0
broadcast 192.168.1.255
gateway 192.168.1.1
dns-nameservers 192.168.1.1
# Secondary
allow-hotplug enp0s9
iface enp0s9 inet static
address 10.0.0.1
netmask 255.255.255.0
post-up iptables -t nat -A POSTROUTING -o enp0s8 -j SNAT --to-source 192.168.1.12
post-down iptables -t nat -D POSTROUTING -o enp0s8 -j SNAT --to-source 192.168.1.12

johnsutten
2018-08-20 13:07
morning all

johnsutten
2018-08-20 13:10
in setting up my workflow i may have put too many steps in? https://imgur.com/a/PzFubld

zehicle
2018-08-20 16:14
it's just the way the render bunches up the icons. I need to add a nowrap style

zehicle
2018-08-20 16:14
if you check the row, it will show you the stages on the right

johnsutten
2018-08-20 16:15
i updated my ssh-key in the global template. the sshkey goes after the ubuntu install?

zehicle
2018-08-20 16:17
if you want the keys installed in the OS

zehicle
2018-08-20 16:17
it would be in twice if you build a workflow that's both discover AND install ubuntu as a single thing

johnsutten
2018-08-20 16:17
a shorter and more efficient workflow could be - sledgehammer- wait, ubuntu, ssh-key and then the complete to keep runner going?

zehicle
2018-08-20 16:17
assuming you want your keys injected into sledgehammer in that case

zehicle
2018-08-20 16:18
you need to add a runner install stage if you want the DRP runner to keep working in the final OS

zehicle
2018-08-20 16:19
you really only need sledgehammer-wait if you are staying in sledgehammer. it's never mid-workflow

zehicle
2018-08-20 16:19
it really does not do much

johnsutten
2018-08-20 16:19
i have been watching the videos and you guys have been busy. what is the benefit of the DRP runner?

zehicle
2018-08-20 16:19
it's required if you want workflows

zehicle
2018-08-20 16:20
once it's off, DRP cannot control the system at all.

zehicle
2018-08-20 16:21
DRP runner = DRP agent = DRP CLI. All the same thing

johnsutten
2018-08-20 16:22
ok i am not seeing drp as an option for workflow

zehicle
2018-08-20 16:24
??

zehicle
2018-08-20 16:25
it's automatically started in all of the default templates

zehicle
2018-08-20 16:25
in O/S installs, it will terminate by default after all the post-provision work is complete UNLESS you install it as a service.

johnsutten
2018-08-20 16:27
i was following your 'ROSE' video; i think there were some steps i'm left with questions on

zehicle
2018-08-20 16:28
then it will keep running and automatically detect when workflows change (including reboots)

zehicle
2018-08-20 16:30
ROSE would be a pretty advanced place to start.

zehicle
2018-08-20 16:30
what are you trying to accomplish?

johnsutten
2018-08-20 16:46
i am trying to create a lab that i can use to deploy whatever i need. ideally... I like what you did with ROSE as with openstack i can deploy anything i need

johnsutten
2018-08-20 16:47
my original goal had to be put on hold from when i had http://iamkey.org

johnsutten
2018-08-20 16:47
at least now i can walk and move about but cant lift anything less than 30 lbs

johnsutten
2018-08-20 16:48
the goal was to use this tool to setup learning environment configurations. maybe someone needs help with windows or linux commands

johnsutten
2018-08-20 16:50
it would probably be faster by phone to say it than to put everyone else to sleep here

zehicle
2018-08-20 18:15
we're booked this afternoon - would have to be tomorrow for voice. my point w/ ROSE was to encourage you to try smaller steps first. Make sure base Ubuntu provision works first

johnsutten
2018-08-20 18:21
for whatever reason my ubuntu deployment is showing ssh access with no boot environment for the machine when i look at overview.. when i go to machines everything is good to go and green but when i ssh to the box connection is refused. i am using the default for now if you want to look @ http://www.networked.pw

zehicle
2018-08-20 18:33
you need to install your SSH keys in the global profile and include the SSH keys stage


johnsutten
2018-08-20 18:38
i did that in the global i thought.. maybe i didnt click save?

johnsutten
2018-08-20 18:38
nope its there

zehicle
2018-08-20 18:59
check the job logs for the SSH stage. It may have an error that helps show what happened

johnsutten
2018-08-20 19:47
i just went and checked and ubuntu 18 was stuck on preseed so i just rebooted the client

johnsutten
2018-08-20 20:07
went to check the process and it was going along fine and still stuck on preseed... if i deploy a machine with sledgehammer i can ssh key to it

greg
2018-08-20 20:13
Make sure your workflow has a finish-install or complete at the end otherwise the install will hang. If you change the workflow, you will need to change the machine out of that workflow and then back into. Then reboot the machine.

johnsutten
2018-08-20 20:53
@greg The workflow i have now is install ubuntu18, ssh access and finish install - i did update the global so it has my ssh key - am i missing anything?

greg
2018-08-20 20:59
That should be okay. Assuming you don't want to run anything in the rebooted installed os.

zehicle
2018-08-20 20:59
typically, we also include the DRP runner stage so that you can change workflow and have it detect that

johnsutten
2018-08-20 21:00
@greg what do i need to run something in the rebooted install os

greg
2018-08-20 21:00
That is what @zehicle is talking about.

greg
2018-08-20 21:00
Add the `task-library` content package.

greg
2018-08-20 21:01
That will add the `runner-service` stage. Put that before `finish-install` but after `ubuntu-18.04-install`. That will put the runner in the post-install environment and start it on boot.

greg
2018-08-20 21:01
Then you put `complete` after `finish-install`

greg
2018-08-20 21:01
Once a system gets to `complete` you will know it has booted and checked in.

greg
2018-08-20 21:01
Anything you want to do post-install can be added between `finish-install` and `complete`.
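
The ordering described in the last few messages can be written down as a quick check. This is a sketch only: the four stage names come from the thread, while `ssh-access` in the example is an assumed name for the SSH keys stage and `check_stage_order` is a hypothetical helper, not a DRP API.

```python
# Order described in the thread: OS install, then runner-service,
# then finish-install, then complete.
EXPECTED_ORDER = ["ubuntu-18.04-install", "runner-service", "finish-install", "complete"]


def check_stage_order(stages):
    """Return a list of problems; an empty list means the order matches."""
    problems = [f"missing stage: {name}" for name in EXPECTED_ORDER if name not in stages]
    # Positions of the expected stages that are present, in expected order.
    positions = [stages.index(name) for name in EXPECTED_ORDER if name in stages]
    if positions != sorted(positions):
        problems.append("stages are out of order")
    return problems


if __name__ == "__main__":
    workflow = ["ubuntu-18.04-install", "ssh-access", "runner-service",
                "finish-install", "complete"]
    print(check_stage_order(workflow))
```

Extra stages between the expected ones are allowed, which covers post-install work placed between `finish-install` and `complete`.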

johnsutten
2018-08-20 21:05
weird - the task-library won't add

zehicle
2018-08-20 21:06
check the Logs under the Endpoint Admin section.

johnsutten
2018-08-20 21:07
2 [15:08] xid 0xd606ccd7: Ignoring request for DHCP server 192.168.1.1 dhcp dhcp.go:834
[14:24] xid 0xe8782bdf: Ignoring request for DHCP server 192.168.1.1 dhcp dhcp.go:834
[12:33] xid 0x6b66dfba: Ignoring request for DHCP server 192.168.1.1 dhcp dhcp.go:834
[11:47] xid 0xef1bcb2c: Ignoring request for DHCP server 192.168.1.1 dhcp dhcp.go:834
[10:03] xid 0xb12354f0: Ignoring request for DHCP server 192.168.1.1 dhcp dhcp.go:834
[09:54] xid 0xb17f1a17: Another DHCP server may be on the network: <nil> dhcp dhcp.go:862
[09:34] xid 0x6b88c76b: Ignoring request for DHCP server 192.168.1.1 dhcp dhcp.go:834
[07:34] No DRP authentication token from 76.125.103.251 frontend frontend.go:287
[06:58] xid 0xd334a0f6: Ignoring request for DHCP server 192.168.1.1 dhcp dhcp.go:834
[06:44] xid 0x176e3f6b: Ignoring request for DHCP server 192.168.1.1 dhcp dhcp.go:834
[21:08] Authenticated user rocketskates from 127.0.0.1 frontend frontend.go:283
3 [20:45] Authenticated user rocketskates from 76.125.103.251 frontend frontend.go:283
[20:32] Authenticated user rocketskates from 127.0.0.1 frontend frontend.go:283

johnsutten
2018-08-20 21:07
nothing useful

johnsutten
2018-08-20 21:08
i got it

johnsutten
2018-08-20 21:36
i have a green status that it is all done, but when i look at the overview it is yellow... im confused

johnsutten
2018-08-20 21:40
when i still try to ssh - connection refused

zehicle
2018-08-21 16:03
@johnsutten colors are generally (not always) determined by the item being shown. So overview is showing you stages. If your ending stage is yellow (e.g. sledgehammer-wait is an intermediate state) then overview will show yellow.

zehicle
2018-08-21 16:03
for the bulk page, the "all tasks are done" color is green

johnsutten
2018-08-22 15:38
Morning all -

johnsutten
2018-08-22 15:39
do we know if there is any way to use this tool without PXE and just send a command to add a node?

greg
2018-08-22 16:06
drpcli machines create

johnsutten
2018-08-22 16:17
so if i had machines 192.168.1.202 - 250 - how could i use the tool to provision machines ? i wouldnt want another endpoint just be able to manage the servers

greg
2018-08-22 16:20
you could use DRP DHCP reservations to assign mac to IP.

greg
2018-08-22 16:20
You can also use the DRP Subnet Active range to scope the IP addresses handed out by DRP DHCP.

shane
2018-08-22 16:27
@johnsutten you can create a Machine Object to represent a physical machine - without booting the machine in to sledgehammer - basically - take a look at `drpcli machines show Name:<MACHINE_NAME>` - save that to a file, modify appropriately, then create with that JSON as input

florent.wagener
2018-08-22 19:17
hey guys, I am currently trying to check if a param exists on a machine. The thing is, I do not know the name of the parameter in advance. It is a generated name that concatenates the interface name with a string. Typically I have these parameters:
```
eno1/lldp/chassis
eno2/lldp/chassis
etc...
```
To make it simple, I have a stage that retrieves the lldp information of all interfaces and stores the chassis and port of each interface in a param on the machine object in DRP. What I want to do after that is compare the previously retrieved lldp information with what is stored in our DCIM. I want to ensure that the lldp parameters of each interface are there before doing the comparison, so I am trying to write a piece of python code that looks like this:
```
def nic_inventory_check(nic_list):
    exit_message = []
    exit_code = []
    for nic in nic_list:
        {{if .ParamExists "{0}/lldp/port".format(nic)}}
        exit_message.append(("Param {0}/lldp/port found.".format(nic)))
        exit_code.append(0)
        yield {'{0}/lldp/port'.format(nic)): {{.Param "{0}/lldp/port".format(nic)}}}
        {{else}}
        exit_message.append(("Param {0}/lldp/port not found.".format(nic)))
        exit_code.append(2)
        yield 'error'
        {{end}}
        {{if .ParamExists "{0}/lldp/chassis".format(nic)}}
        exit_message.append(("Param {0}/lldp/chassis found.".format(nic)))
        exit_code.append(0)
        yield {'{0}/lldp/chassis'.format(nic)): {{.Param "{0}/lldp/chassis".format(nic)}}
        {{else}}
        exit_message.append(("Param {0}/lldp/chassis not found.".format(nic)))
        exit_code.append(2)
        yield 'error'
        {{end}}
```
Of course this piece of code isn't accepted by DRP: `{{if .ParamExists "{0}/lldp/port".format(nic)}}`. It is telling me that there is a parsing error:
```
Parse error for template ubisoft-lldp-check.py.tmpl: template: ubisoft-lldp-check.py.tmpl:88: unexpected . after term "\"{0}/lldp/port\""
```

florent.wagener
2018-08-22 19:18
I assume that mixing python code with `golang/text template` isn't working. Do you guys have an idea on how I could make this work ?

shane
2018-08-22 19:36
@florent.wagener - correct - you can not mix Golang Templating and Python. The golang template piece is interpreted on the DRP Endpoint _prior_ to being handed to the Machine, and subsequently the Python interpreter.

shane
2018-08-22 19:37
In this case - you would probably want to make logic choices and local machine checks first in Python - then use `drpcli` to call to the DRP Endpoint to get the params info. You'll need to generate a Token for use in the Python piece, to pass to `drpcli` to authenticate. See the `setup.tmpl` template for a Bash example.

shane
2018-08-22 19:38
We do have a pattern in the Inventory content that is designed to store and check information related to hardware and state changes in the hardware, using DRP itself as the data store. You might want to see if this pattern is adaptable to what you are doing.


shane
2018-08-22 19:41
We have also added a lot of Golang Template helpers in the most current version of DRP (v3.10.0) that extend and add a lot of capabilities to the server-side golang template processing. This won't necessarily help you with this specific task, but you might want to take a look at it for future reference and templating work: http://masterminds.github.io/sprig/

shane
2018-08-22 19:42
additionally - we have some work around LLDP we've done already as part of the Network menu functionality

shane
2018-08-22 19:50
Also - you can use a consistently named Param to store the information in - but use a more complex structure to store the dynamic details in that Param - you can then use a `{{range ... }}` golang template structure to iterate over each of the key/value pairs - and pass that in for use in Python - here's an example of `range`


florent.wagener
2018-08-22 19:50
thanks @shane This is useful information, I'm gonna read all of that and keep you posted :slightly_smiling_face:

vlowther
2018-08-22 20:01
The other thing you could do is store the LLDP info for all of the nics as a single JSON blob

vlowther
2018-08-22 20:01
in a single param

shane
2018-08-22 20:01
^^^ "consistently named Param to store information in" :slightly_smiling_face:

florent.wagener
2018-08-22 20:03
@vlowther I could do that too yes...

vlowther
2018-08-22 20:03
then you can have one {{.ParamAsJSON "param-name"}} call to get the param as a JSON blob, and do all your iterating in Python.
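
A minimal sketch of that approach. The layout of the blob (interface name mapping to an object with `chassis` and `port` keys) is an assumption made for illustration, as are the sample values and the `check_lldp` name. In a real template the JSON string would come from the rendered `.ParamAsJSON` call; a literal stands in for it here so the example runs on its own.

```python
import json

REQUIRED_KEYS = ("chassis", "port")


def check_lldp(blob, nic_list):
    """Return (messages, exit_code): 0 if every NIC has chassis and port, else 2."""
    lldp = json.loads(blob)
    messages = []
    exit_code = 0
    for nic in nic_list:
        entry = lldp.get(nic, {})
        for key in REQUIRED_KEYS:
            if key in entry:
                messages.append(f"Param {nic}/lldp/{key} found.")
            else:
                messages.append(f"Param {nic}/lldp/{key} not found.")
                exit_code = 2
    return messages, exit_code


if __name__ == "__main__":
    # Stand-in for the string the template engine would render on the endpoint.
    sample = '{"eno1": {"chassis": "sw1", "port": "Eth1/1"}, "eno2": {"chassis": "sw1"}}'
    for line in check_lldp(sample, ["eno1", "eno2"])[0]:
        print(line)
```

All of the branching happens in Python after rendering, so no template expression ever has to depend on a Python variable.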

zehicle
2018-08-22 20:04
I think we have a lldp json stage

florent.wagener
2018-08-22 20:05
@zehicle yes you did, but this is much like the gohai stage. There is a lot of information I don't need in that param. That's why I am interested in the inventory stage that @shane mentioned.

zehicle
2018-08-22 20:13
yes. My expectation was that you would be able to use the adhoc jq filters from the inventory on lldp data to add more keys into your inventory/data.

zehicle
2018-08-22 20:13
the inventory stage design allows you to specify any command that creates json and then apply jq to filter down to critical fields.

florent.wagener
2018-08-22 20:13
oooh

florent.wagener
2018-08-22 20:14
that's fancy

florent.wagener
2018-08-22 20:14
Is there any example ? not sure how to do it though.


florent.wagener
2018-08-22 20:32
This is amazing.

florent.wagener
2018-08-22 20:33
I'm gonna give this a try first.

florent.wagener
2018-08-22 20:34
This could be the answer for a lot of things I am already doing and some I am not yet doing :smile:

zehicle
2018-08-23 04:39
SPRIG ROCKS

zehicle
2018-08-23 04:48
Fun little integration for KRIB and inventory.... if you use the inventory stage (creates inventory/data from gohai) then KRIB will automatically use that data to label the nodes.

zehicle
2018-08-23 04:49
That means that you can use the inventory information to select nodes for scheduling in Kubernetes automatically

2018-08-23 12:44
has joined #community201808

2018-08-23 14:50
has joined #community201808

florent.wagener
2018-08-24 14:24
Hi guys, working on what we talked about on Wednesday. I've got an issue with `inventory/collect`. Here's my config:
```
"inventory/collect": {
  "gohai": {
    "Command": "drpcli gohai",
    "Fields": {
      "Interface": ".Networking.Interfaces[].StableName",
      "SerialNumber": ".DMI.System.SerialNumber"
    }
  },
```
Testing `".Networking.Interfaces[].StableName"` directly on the target server works without issue, but when it runs inside a stage/workflow, I get this error:
```
=== COLLECTING INVENTORY ====
Parsing Inventory Group gohai for Machine 8af50e62-70f1-42b6-ad68-1297df75ae8d
Collecting Data with cmd: drpcli gohai
Starting Collection Loop ...
Interface = ""
"eno1"
"eno2"
"eno3"
"eno4"
"eno49"
"eno50"
(from jq .Networking.Interfaces[].StableName)
jq: error: syntax error, unexpected QQSTRING_START, expecting '}' (Unix shell quoting issues?) at <top-level>, line 2:
"eno1"
jq: 1 compile error
Command exited with status 3
```
Is it possible that I'm missing something, or have I found a bug?

zehicle
2018-08-24 15:15
jq can be frustrating

zehicle
2018-08-24 15:16
one suggestion for working on syntax, use `drpcli gohai | jq [query]` to work out the queries

zehicle
2018-08-24 15:17
you should also look at how the template is rendered to help track down syntax problems. The test UX has that feature, or the CLI can show you the rendered templates for jobs too

zehicle
2018-08-24 15:24
@romain.lafontaine it would help if I could see the raw data dumps from the system. This type of jq parse issue is often related to the data being different between systems. you may need additional jq pipeline steps

zehicle
2018-08-24 15:25
sorry @florent.wagener ^^ (sorry, Romain and I have a different thread going)

romain.lafontaine
2018-08-24 15:25
:slightly_smiling_face:

greg
2018-08-24 15:29
@florent.wagener - what are you trying to do?

florent.wagener
2018-08-24 15:29
just gathering the network interface name.

greg
2018-08-24 15:29
names?

florent.wagener
2018-08-24 15:29
messing with inventory/collect

greg
2018-08-24 15:30
I know - I mean you have a var that you are creating - `Interface`

greg
2018-08-24 15:31
You are wanting the data from `jq .Networking.Interfaces[].StableName`

florent.wagener
2018-08-24 15:31
yeah

greg
2018-08-24 15:31
This is going to be not quite a list.

greg
2018-08-24 15:31
So, what is Interface - a JSON array, a comma-separated string, or what?

greg
2018-08-24 15:32
This breaks because it turns into this `{ "Interface": "" "en1" "en2" }`

greg
2018-08-24 15:32
That is not valid json.
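A small sketch of why this fails, using Python's `json` module as a stand-in for the JSON parser. It assumes, as greg describes, that the raw jq output is pasted after the key; the exact string the stage builds is not shown in this thread.

```python
import json


def is_valid_json(text):
    """Return True if text parses as a single JSON document."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False


# Several separate jq results pasted after one key: not a JSON value.
broken = '{ "Interface": "" "en1" "en2" }'
# One JSON string or one JSON array after the key: both parse.
as_string = '{ "Interface": "en1,en2" }'
as_array = '{ "Interface": ["en1", "en2"] }'

print(is_valid_json(broken), is_valid_json(as_string), is_valid_json(as_array))
```

The takeaway is that each field's jq query has to produce exactly one JSON value.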

florent.wagener
2018-08-24 15:33
```
drpcli gohai | jq .Networking.Interfaces[].StableName
""
"eno1"
"eno2"
"eno3"
"eno4"
"eno49"
"eno50"
```

florent.wagener
2018-08-24 15:33
that's the output

greg
2018-08-24 15:33
Yeah - so that isn't valid json today.

florent.wagener
2018-08-24 15:33
okok

florent.wagener
2018-08-24 15:33
so I am too precise with my request

greg
2018-08-24 15:36
looking at something - @florent.wagener - just a second

greg
2018-08-24 15:42
This will generate a comma separated list without blanks. `.Networking.Interfaces | map(.StableName|select(length > 0)) | join(",")`

greg
2018-08-24 15:42
That creates `"enp0s3,enp0s8"` on my system.

greg
2018-08-24 15:43
Which is a valid json string and will store in Interface.

greg
2018-08-24 15:43
If you want an array, you would need...

greg
2018-08-24 15:43
`.Networking.Interfaces | map(.StableName|select(length > 0)) `

greg
2018-08-24 15:43
That will generate: ``` [ "enp0s3", "enp0s8" ]```
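For readers less familiar with jq, here is the same filtering written as a Python sketch. The sample data is made up and only mimics the part of the `drpcli gohai` output that the filter touches (`Networking.Interfaces[].StableName`).

```python
import json

# Made-up sample shaped like the part of gohai output the filter reads.
gohai = json.loads('''
{"Networking": {"Interfaces": [
  {"StableName": ""},
  {"StableName": "enp0s3"},
  {"StableName": "enp0s8"}
]}}
''')

# jq: .Networking.Interfaces | map(.StableName | select(length > 0))
names = [nic["StableName"]
         for nic in gohai["Networking"]["Interfaces"]
         if len(nic["StableName"]) > 0]

# jq: ... | join(",")
joined = ",".join(names)

print(json.dumps(names))   # a single JSON array
print(json.dumps(joined))  # a single JSON string
```

Either result is one JSON value, which is what lets it be stored in the `Interface` field.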

greg
2018-08-24 15:46
@florent.wagener - does that help?

florent.wagener
2018-08-24 15:48
let me try

florent.wagener
2018-08-24 15:54
This works :slightly_smiling_face:

florent.wagener
2018-08-24 15:57
thanks @greg

2018-08-25 21:39
Is there a trick to getting `package-repositories` with `installSource: true` working correctly with Kickstart-based systems? I don't want the install URL to be the DR server in my case.

2018-08-27 03:46
has joined #community201808

2018-08-27 14:58
has joined #community201808

2018-08-27 15:22
For EFI systems is there a reason to use iPXE vs Grub as the initial bootloader?

2018-08-27 15:29
For whatever it's worth I made a parallel Sledgehammer boot env called Grubhammer :slightly_smiling_face:

greg
2018-08-27 16:31
@ - mostly one set of files.

greg
2018-08-27 16:32
The standard bootenvs know about the files for ipxe and pxelinux.

2018-08-27 16:32
I'm experimenting with grub since the vendor instructions for network boot environments use it


greg
2018-08-27 16:33
okay. If grub asks for common file locations, then it is easy enough to templatize it into the standard templates.

2018-08-27 16:34
I got it to work pretty easily

2018-08-27 16:35
Sledgehammer won't SecureBoot but otherwise everything works great

greg
2018-08-27 16:36
okay - cool.

zehicle
2018-08-27 16:45
hello @ $welcome

2018-08-27 16:45
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

vlowther
2018-08-27 18:32
It was mostly that ipxe was more reliable and easier to understand than Grub2 was when we started writing boot environments.

zehicle
2018-08-27 18:54
TEMPLATE NOTE: SPRIG COMMANDS were added in v3.10. They are NOT backwards compatible! Templates that use them will not render in previous versions

zehicle
2018-08-27 23:03
Reminder for Digital Rebar Meeting tomorrow @ 1 central. https://www.meetup.com/digitalrebar/

shane
2018-08-28 16:30
DRP Meetup v024 today at 11:00 am PST, agenda:
• Kubernetes KRIB - Operational Patterns (drain, delete, uncordon, Worker upgrades and adding new)
• Kubernetes KRIB - Sonobuoy and Helm Charts
• No Reboot Installs (kexec) provisioning
https://www.meetup.com/digitalrebar/events/lchdhpyxmbpb/

zehicle
2018-08-28 17:47
OMG.... PROVISION WITHOUT REBOOT. That is crazy fast stuff

florent.wagener
2018-08-28 17:51
for baremetal ?

shane
2018-08-28 17:52
yep ... come join us on meetup today in 8 minutes ... :slightly_smiling_face:

florent.wagener
2018-08-28 17:52
damn

florent.wagener
2018-08-28 17:52
im already in

florent.wagener
2018-08-28 17:52
perfect time for a smoke before hearing all of that ^^

greg
2018-08-28 19:20
- per the meet-up, tip is now validated again. Things weren't really broken; it was more that they weren't passing unit tests. The tip now passes unit tests.

greg
2018-08-28 19:20
version: v3.10.0-tip-59-87533dc67c39ea18cb17c371a9f811d16f786e0d

zehicle
2018-08-28 19:22
Meetup Recording Posted!! https://youtu.be/beGOdRNl24o

2018-08-28 19:45
Are you planning support for Pulumi?

shane
2018-08-28 21:05
Looks like Pulumi is an abstracted containerized "cloud" orchestration tool - it would orchestrate "on top of" Kubernetes (and other cloud providers) to power applications within the KRIB built kubernetes clusters - essentially, it's a replacement for Terraform - to allow use of Code as the DSL, as opposed to HashiCorp's DSL (HCL)


2018-08-29 10:10
somehow a spelling error has found itself into the UI tip :smile:

2018-08-29 10:14
oh and usbents

2018-08-29 10:14
subents*

2018-08-29 11:21
so lets say i have an instance of DRP in each DC.. how do i keep the "custom" stuff in sync? e.g. I have custom params, templates, profiles, workflows etc

2018-08-29 11:21
is that what packages are for?

vlowther
2018-08-29 11:58
Content bundles, yes.

zehicle
2018-08-29 12:44
@ RackN had been building tooling for that

2018-08-29 12:57
had? or has :stuck_out_tongue: I found one of the content upload vids which seems to provide a pretty good overview of how it works :slightly_smiling_face:

zehicle
2018-08-29 13:10
Sorry...has. more typos

zehicle
2018-08-29 13:11
Content bundles for single endpoints. We are working on multi site sync

2018-08-29 13:42
gotcha

2018-08-29 13:43
tbh the multi site thing for me isn't a big deal, its more as long as I can package up content from one and sync to another that's all i really need

2018-08-29 13:43
(and by sync i mean manually or whatevs)

zehicle
2018-08-29 13:53
then bundles is perfect

zehicle
2018-08-29 14:19
@ fixes for those typos are moving through the process. You'll need to click Logout to clear the bad menu from your session storage.

2018-08-29 14:22
cool cool :+1:

2018-08-29 15:13
has joined #community201808

2018-08-29 15:43
Hello guys, I'm from Brazil, I'm DevOps at Maxihost Datacenter. Today I'm trying RackN+DigitalRebar, we intend to use Rebar to deploy our physical machines and automate more tasks that we can't do using MaaS.

shane
2018-08-29 15:44
$welcome @

2018-08-29 15:44
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

2018-08-29 15:44
thanks!

2018-08-29 15:46
to go ahead with the POC, I just have a question about how Rebar manages the machines' network interfaces

2018-08-29 15:46
today our machines use 2 NICs, the first one for public access/IPMI and the second NIC for PXE boot over a private network

2018-08-29 15:46
Like that

2018-08-29 15:48
is Rebar able to work with this same network topology? I mean, discovering the 2 NICs

zehicle
2018-08-29 15:50
@ discover = configure? The gohai inventory run on the machine will list all the NICs on the system.

zehicle
2018-08-29 15:50
you want to setup the other network interface in the operating system?

2018-08-29 15:50
yes

2018-08-29 15:55
humm, so I have to use the stage discover that includes gohai

zehicle
2018-08-29 15:55
discover includes gohai by default

zehicle
2018-08-29 15:55
yes, you should be using discover / sledgehammer

2018-08-29 16:03
nice, I'm seeing gohai-inventory now in my UI

2018-08-29 16:07
is it possible to set an IP on the network interface using the GUI, like MaaS, or only using drpcli?

shane
2018-08-29 17:42
@ - the Digital Rebar model uses "workflow", which is composed of flexible building blocks (Stages, Tasks, Templates, Profiles, Params, etc). Part of customizing workflow to your use case might include adding a "stage" and "templates" to do the network configuration. You'd drive your machines through the appropriate workflow from the Sledgehammer / Discovery process, to implement the customizations you need ... as you see in the `gohai-inventory`, we model all of the physical hardware components to make them available to workflow to use to make decisions on deployment details.

shane
2018-08-29 17:44
Additionally, we have "lighter weight" inventory pieces that will summarize "important" fields from the more complex gohai-inventory - which makes it easier to manipulate and use. There are several examples with our current Content Packs that you can review. We also have a "demo" / "training" content pack that you might like to review for how to author content: https://github.com/digitalrebar/colordemo

shane
2018-08-29 17:47
Last - we have a new piece that's not yet production - but will be very instrumental in network configuration - we call it "`NetWrangler`" - which will help to implement complex NIC configurations on machines as part of workflow

shane
2018-08-29 17:48
we discussed NetWrangler in our Community Meetup v020 - meetup recording is at: https://u.rackn.io/mtup20

2018-08-29 18:10
Thank you very much @shane I'm gonna try this now

zehicle
2018-08-29 18:14
@ you may also want to look at this FAQ about using DHCP to assign IPs to machines via automation


2018-08-29 18:18
nice, thanks

2018-08-30 05:01
has joined #community201808

2018-08-30 10:23
NetWrangler looks interesting :+1:

2018-08-30 12:18
Is there a github project for building Sledgehammer? Or documentation of how it's built?


zehicle
2018-08-30 13:04
@ ^^

2018-08-30 13:04
Thanks!

zehicle
2018-08-30 13:04
@ $welcome

2018-08-30 13:04
Digital Rebar welcome information is here > http://rebar.digital/community/welcome.html

greg
2018-08-30 13:13
@ we are trying to move off of that and use drp to build it.

florent.wagener
2018-08-30 18:10
hey guys, I am parsing the documentation regarding golang text/template, and I am having trouble finding out whether it's possible to create a `for` loop structure given a JSON input. Let's say I have this param (called `interfaces`) on a machine:
```
[
  {
    "Name": "eno2",
    "MacAddress": "ac:1f:6b:48:bc:e3",
    "StableName": "eno2"
  },
  {
    "Name": "eth0",
    "MacAddress": "ac:1f:6b:48:bc:e2",
    "StableName": "eno1"
  }
]
```
And I want to use the `curtin/network/template` param to configure the network on my server. So I am going to create a new template file that I assume should look like this, if my idea is possible:
```
network:
  version: 1
  config:
  {{ if .ParamExists "interfaces" }}
  {{ range .Param "interfaces" }}
    - type: physical
    - name: {{ .Param "interfaces" }} (I'm looking to get the stable name here but don't know how)
  {{ end }}
  {{ end }}
```

florent.wagener
2018-08-30 18:12
didn't get any good example of what I want to do here: https://golang.org/pkg/text/template/#hdr-Actions

zehicle
2018-08-30 18:16
yes, you can do it using range functions

zehicle
2018-08-30 18:17
@florent.wagener here's an example

zehicle
2018-08-30 18:18
the new `sprig` functions provide some array options too.

florent.wagener
2018-08-30 18:18
woohooo!

florent.wagener
2018-08-30 18:18
thx @zehicle trying that now!

zehicle
2018-08-30 18:19
the sprig pieces let you do all sorts of crazy manipulations; HOWEVER, THEY REQUIRE v3.10+

zehicle
2018-08-30 18:20
that example includes toString and replace, which are both from the sprig library
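The example zehicle linked is not preserved in this log. As a language-neutral illustration of the loop being asked about, here is a Python sketch that walks the `interfaces` list from the question and emits one `physical` entry per NIC. It is not a DRP template: the `mac_address` key and the exact output layout are assumptions about the curtin network config format, and in a Go template the same idea would be a `range` over the param with the field read from each element.

```python
# Sample data copied from the question (the `interfaces` param).
interfaces = [
    {"Name": "eno2", "MacAddress": "ac:1f:6b:48:bc:e3", "StableName": "eno2"},
    {"Name": "eth0", "MacAddress": "ac:1f:6b:48:bc:e2", "StableName": "eno1"},
]


def render_network_config(nics):
    """Build the network config text with one physical entry per NIC."""
    lines = ["network:", "  version: 1", "  config:"]
    for nic in nics:
        # Inside the loop each element is one NIC, so its fields are read
        # from the element itself rather than from the whole param.
        lines.append("    - type: physical")
        lines.append("      name: %s" % nic["StableName"])
        lines.append("      mac_address: '%s'" % nic["MacAddress"])
    return "\n".join(lines)


network_yaml = render_network_config(interfaces)
print(network_yaml)
```

The point to carry over is that the loop body works on the current element, not on the original list.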

zehicle
2018-08-30 19:11
ICYMI video showing how to use kexec to skip boots during provision: https://youtu.be/Xm688Km3N4Y

zehicle
2018-08-30 19:12
requires latest TIP content

2018-08-31 16:44
Hi guys, I'm able to install machines with 2 NICs via DHCP. Is it possible to work with static IPs for the access interface (not PXE), or does Digital Rebar support only DHCP?

zehicle
2018-08-31 16:54
@ yes, it's possible to use static IPs on NICs. There's just no UX / or default way to define it

zehicle
2018-08-31 16:55
because we don't see consistent patterns for defining the networks in the templates

zehicle
2018-08-31 16:56
we've made it very easy to add post-configuration tasks that you can use to define the NICs and pull params from the machine model

2018-08-31 16:57
good, would you have any example?


zehicle
2018-08-31 16:58
shows how to add stages into the system

zehicle
2018-08-31 16:58
including how to read and set params on machines

zehicle
2018-08-31 16:58
the videos are designed as instruction

zehicle
2018-08-31 16:59
you'll need to create a task that sets the network config.

zehicle
2018-08-31 17:00
if you can figure out a generic pattern for a stage, it could be a great addition to the community content.

2018-08-31 17:02
nice, I'll try this way